MATERIALS AND METHODS FOR
THE MODIFICATION OF PLANT LIGNIN CONTENT

Technical Field of the Invention

This invention relates to the field of modification of lignin content and 
composition in plants. More particularly, this invention relates to enzymes involved in 
the lignin biosynthetic pathway and nucleotide sequences encoding such enzymes. 

Background of the Invention

Lignin is an insoluble polymer which is primarily responsible for the rigidity of
plant stems. Specifically, lignin serves as a matrix around the polysaccharide
components of some plant cell walls. The higher the lignin content, the more rigid the
plant. For example, tree species synthesize large quantities of lignin, with lignin
constituting between 20% and 30% of the dry weight of wood. In addition to providing
rigidity, lignin aids in water transport within plants by rendering cell walls hydrophobic
and water impermeable. Lignin also plays a role in disease resistance of plants by
impeding the penetration and propagation of pathogenic agents.

The high concentration of lignin in trees presents a significant problem in the
paper industry, wherein considerable resources must be employed to separate lignin
from the cellulose fiber needed for the production of paper. Methods typically
employed for the removal of lignin are highly energy- and chemical-intensive, resulting
in increased costs and increased levels of undesirable waste products. In the U.S. alone,
about 20 million tons of lignin are removed from wood per year.

Lignin is largely responsible for the digestibility, or lack thereof, of forage
crops, with small increases in plant lignin content resulting in relatively high decreases
in digestibility. For example, crops with reduced lignin content provide more efficient
forage for cattle, with the yield of milk and meat being higher relative to the amount of
forage crop consumed. During normal plant growth, the increase in dry matter content
is accompanied by a corresponding decrease in digestibility. When deciding on the
optimum time to harvest forage crops, farmers must therefore choose between a high
yield of less digestible material and a lower yield of more digestible material.

For some applications, an increase in lignin content is desirable since increasing
the lignin content of a plant would lead to increased mechanical strength of wood,
changes in its color and increased resistance to rot. Mycorrhizal species composition
and abundance may also be favorably manipulated by modifying lignin content and
structural composition.

As discussed in detail below, lignin is formed by polymerization of at least three
different monolignols which are synthesized in a multistep pathway, each step in the
pathway being catalyzed by a different enzyme. It has been shown that manipulation of
the number of copies of genes encoding certain enzymes, such as cinnamyl alcohol
dehydrogenase (CAD) and caffeic acid 3-O-methyltransferase (COMT), results in
modification of the amount of lignin produced; see, for example, U.S. Patent No.
5,451,514 and PCT publication no. WO 94/23044. Furthermore, it has been shown that
antisense expression of sequences encoding CAD in poplar leads to the production of
lignin having a modified composition (Grand, C. et al. Planta (Berl.) 163:232-237
(1985)).

While DNA sequences encoding some of the enzymes involved in the lignin
biosynthetic pathway have been isolated for certain species of plants, genes encoding
many of the enzymes in a wide range of plant species have not yet been identified.
Thus there remains a need in the art for materials useful in the modification of lignin
content and composition in plants and for methods for their use.

Summary of the Invention 

Briefly, the present invention provides isolated DNA sequences obtainable from
eucalyptus and pine which encode enzymes involved in the lignin biosynthetic
pathway, DNA constructs including such sequences, and methods for the use of such
constructs. Transgenic plants having altered lignin content and composition are also
provided.

In a first aspect, the present invention provides isolated DNA sequences coding
for the following enzymes isolated from eucalyptus and pine: cinnamate 4-hydroxylase
(C4H), coumarate 3-hydroxylase (C3H), phenolase (PNL), O-methyl transferase
(OMT), cinnamyl alcohol dehydrogenase (CAD), cinnamoyl-CoA reductase (CCR),
phenylalanine ammonia-lyase (PAL), 4-coumarate:CoA ligase (4CL), coniferol
glucosyl transferase (CGT), coniferin beta-glucosidase (CBG), laccase (LAC) and
peroxidase (POX), together with ferulate-5-hydroxylase (F5H) from eucalyptus. In one
embodiment, the isolated DNA sequences comprise a nucleotide sequence selected
from the group consisting of: (a) sequences recited in SEQ ID NO: 3, 13, 16-70 and
72-88; (b) complements of the sequences recited in SEQ ID NO: 3, 13, 16-70 and 72-88;
(c) reverse complements of the sequences recited in SEQ ID NO: 3, 13, 16-70 and 72-88;
(d) reverse sequences of the sequences recited in SEQ ID NO: 3, 13, 16-70 and 72-88; and
(e) sequences having at least about a 99% probability of being the same as a sequence
of (a) - (d) as measured by the computer algorithm FASTA.

In another aspect, the invention provides DNA constructs comprising a DNA
sequence of the present invention, either alone, in combination with one or more of the
inventive sequences or in combination with one or more known DNA sequences;
together with transgenic cells comprising such constructs.

In a related aspect, the present invention provides DNA constructs comprising,
in the 5'-3' direction, a gene promoter sequence; an open reading frame coding for at
least a functional portion of an enzyme encoded by the inventive DNA sequences or
variants thereof; and a gene termination sequence. The open reading frame may be
orientated in either a sense or antisense direction. DNA constructs comprising a
non-coding region of a gene coding for an enzyme encoded by the above DNA sequences or
a nucleotide sequence complementary to a non-coding region, together with a gene
promoter sequence and a gene termination sequence, are also provided. Preferably, the
gene promoter and termination sequences are functional in a host plant. Most
preferably, the gene promoter and termination sequences are those of the original
enzyme genes, but others generally used in the art, such as the Cauliflower Mosaic
Virus (CMV) promoter, with or without enhancers, such as the Kozak sequence or
Omega enhancer, and the Agrobacterium tumefaciens nopaline synthase terminator, may be
usefully employed in the present invention. Tissue-specific promoters may be
employed in order to target expression to one or more desired tissues. In a preferred
embodiment, the gene promoter sequence provides for transcription in xylem. The
DNA construct may further include a marker for the identification of transformed cells.

In a further aspect, transgenic plant cells comprising the DNA constructs of the
present invention are provided, together with plants comprising such transgenic cells,
and fruits and seeds of such plants.

In yet another aspect, methods for modulating the lignin content and
composition of a plant are provided, such methods including stably incorporating into
the genome of the plant a DNA construct of the present invention. In a preferred
embodiment, the target plant is a woody plant, preferably selected from the group
consisting of eucalyptus and pine species, most preferably from the group consisting of
Eucalyptus grandis and Pinus radiata. In a related aspect, a method for producing a
plant having altered lignin content is provided, the method comprising transforming a
plant cell with a DNA construct of the present invention to provide a transgenic cell,
and cultivating the transgenic cell under conditions conducive to regeneration and
mature plant growth.

In yet a further aspect, the present invention provides methods for modifying the
activity of an enzyme in a plant, comprising stably incorporating into the genome of the
plant a DNA construct of the present invention. In a preferred embodiment, the target
plant is a woody plant, preferably selected from the group consisting of eucalyptus and
pine species, most preferably from the group consisting of Eucalyptus grandis and
Pinus radiata.

The above-mentioned and additional features of the present invention and the
manner of obtaining them will become apparent, and the invention will be best
understood by reference to the following more detailed description, read in conjunction
with the accompanying drawing.

Brief Description of the Figures

Fig. 1 is a schematic overview of the lignin biosynthetic pathway. 

Detailed Description 

Lignin is formed by polymerization of at least three different monolignols,
primarily para-coumaryl alcohol, coniferyl alcohol and sinapyl alcohol. While these
three types of lignin subunits are well known, it is possible that slightly different
variants of these subunits may be involved in the lignin biosynthetic pathway in various
plants. The relative concentration of these residues in lignin varies between different
plant species and within species. In addition, the composition of lignin may also vary
between different tissues within a specific plant. The three monolignols are derived
from phenylalanine in a multistep process and are believed to be polymerized into
lignin by a free radical mechanism.

Fig. 1 shows the different steps in the biosynthetic pathway for coniferyl alcohol
together with the enzymes responsible for catalyzing each step. para-Coumaryl alcohol
and sinapyl alcohol are synthesized by similar pathways. Phenylalanine is first
deaminated by phenylalanine ammonia-lyase (PAL) to give cinnamate, which is then
hydroxylated by cinnamate 4-hydroxylase (C4H) to form p-coumarate. p-Coumarate is
hydroxylated by coumarate 3-hydroxylase (C3H) to give caffeate. The newly added hydroxyl
group is then methylated by O-methyl transferase (OMT) to give ferulate, which is
conjugated to coenzyme A by 4-coumarate:CoA ligase (4CL) to form feruloyl-CoA.
Reduction of feruloyl-CoA to coniferaldehyde is catalyzed by cinnamoyl-CoA
reductase (CCR). Coniferaldehyde is further reduced by the action of cinnamyl alcohol
dehydrogenase (CAD) to give coniferyl alcohol, which is then converted into its
glucosylated form for export from the cytoplasm to the cell wall by coniferol glucosyl
transferase (CGT). Following export, the de-glucosylated form of coniferyl alcohol is
obtained by the action of coniferin beta-glucosidase (CBG). Finally, polymerization of
the three monolignols to provide lignin is catalyzed by phenolase (PNL), laccase (LAC)
and peroxidase (POX).
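The sequence of steps just described for coniferyl alcohol can be written down as an ordered table of (substrate, enzyme, product) entries. The Python sketch below is illustrative only and is not part of the disclosed methods; the names CONIFERYL_ALCOHOL_PATHWAY, is_contiguous and products_from are hypothetical, and "coniferin" is used as shorthand for the glucosylated form of coniferyl alcohol.

```python
# Illustrative encoding of the coniferyl alcohol pathway as described in the text.
# Each entry is (substrate, enzyme abbreviation, product).
CONIFERYL_ALCOHOL_PATHWAY = [
    ("phenylalanine", "PAL", "cinnamate"),
    ("cinnamate", "C4H", "p-coumarate"),
    ("p-coumarate", "C3H", "caffeate"),
    ("caffeate", "OMT", "ferulate"),
    ("ferulate", "4CL", "feruloyl-CoA"),
    ("feruloyl-CoA", "CCR", "coniferaldehyde"),
    ("coniferaldehyde", "CAD", "coniferyl alcohol"),
    # "coniferin" stands for the glucosylated export form of coniferyl alcohol.
    ("coniferyl alcohol", "CGT", "coniferin"),
    ("coniferin", "CBG", "coniferyl alcohol"),
]

# Enzymes named in the text as catalyzing the final polymerization into lignin.
POLYMERIZATION_ENZYMES = ("PNL", "LAC", "POX")


def is_contiguous(pathway):
    """Return True if each step's product is the next step's substrate."""
    return all(prev[2] == nxt[0] for prev, nxt in zip(pathway, pathway[1:]))


def products_from(enzyme, pathway=CONIFERYL_ALCOHOL_PATHWAY):
    """List the products formed at and after the step catalyzed by `enzyme`."""
    for index, (_, name, _) in enumerate(pathway):
        if name == enzyme:
            return [product for _, _, product in pathway[index:]]
    raise KeyError(f"enzyme not in pathway: {enzyme}")


if __name__ == "__main__":
    print(is_contiguous(CONIFERYL_ALCOHOL_PATHWAY))
    print(products_from("CCR"))
```

Listing the products at and after a given step is a simple way to see which intermediates could be affected when the activity of that enzyme is modified.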

The formation of sinapyl alcohol involves an additional enzyme,
ferulate-5-hydroxylase (F5H). For a more detailed review of the lignin biosynthetic
pathway, see: Whetten, R. and Sederoff, R., The Plant Cell 7:1001-1013 (1995).

Quantitative and qualitative modifications in plant lignin content are known to
be induced by external factors such as light stimulation, low calcium levels and
mechanical stress. Synthesis of new types of lignins, sometimes in tissues not normally
lignified, can also be induced by infection with pathogens. In addition to lignin, several
other classes of plant products are derived from phenylalanine, including flavonoids,
coumarins, stilbenes and benzoic acid derivatives, with the initial steps in the synthesis
of all these compounds being the same. Thus modification of the action of PAL, C4H
and 4CL may affect the synthesis of other plant products in addition to lignin.

Using the methods and materials of the present invention, the lignin content of a
plant can be increased by incorporating additional copies of genes encoding enzymes
involved in the lignin biosynthetic pathway into the genome of the target plant.
Similarly, a decrease in lignin content can be obtained by transforming the target plant
with antisense copies of such genes. In addition, the number of copies of genes
encoding different enzymes in the lignin biosynthetic pathway can be manipulated
to modify the relative amount of each monolignol synthesized, thereby leading to the
formation of lignin having altered composition. The alteration of lignin composition
would be advantageous, for example, in tree processing for paper, and may also be
effective in altering the palatability of wood materials to rotting fungi.

In one embodiment, the present invention provides isolated complete or partial
DNA sequences encoding, or partially encoding, enzymes involved in the lignin
biosynthetic pathway, the DNA sequences being obtainable from eucalyptus and pine.
Specifically, the present invention provides isolated DNA sequences encoding the
enzymes CAD (SEQ ID NO: 1, 30), PAL (SEQ ID NO: 16), C4H (SEQ ID NO: 17),
C3H (SEQ ID NO: 18), F5H (SEQ ID NO: 19-21), OMT (SEQ ID NO: 22-25), CCR
(SEQ ID NO: 26-29), CGT (SEQ ID NO: 31-33), CBG (SEQ ID NO: 34), PNL (SEQ
ID NO: 35, 36), LAC (SEQ ID NO: 37-41) and POX (SEQ ID NO: 42-44) from
Eucalyptus grandis; and the enzymes C4H (SEQ ID NO: 2, 3, 48, 49), C3H (SEQ ID
NO: 4, 50-52), PNL (SEQ ID NO: 5, 81), OMT (SEQ ID NO: 6, 53-55), CAD (SEQ ID
NO: 7, 71), CCR (SEQ ID NO: 8, 58-70), PAL (SEQ ID NO: 9-11, 45-47), 4CL (SEQ
ID NO: 12, 56, 57), CGT (SEQ ID NO: 72), CBG (SEQ ID NO: 73-80), LAC (SEQ ID
NO: 82-84) and POX (SEQ ID NO: 13, 85-88) from Pinus radiata. Complements of
such isolated DNA sequences, reverse complements of such isolated DNA sequences
and reverse sequences of such isolated DNA sequences, together with variants of such
sequences, are also provided. DNA sequences encompassed by the present invention
include cDNA, genomic DNA, recombinant DNA and wholly or partially chemically
synthesized DNA molecules.

The definition of the terms "complement", "reverse complement" and "reverse
sequence", as used herein, is best illustrated by the following example. For the
sequence 5' AGGACC 3', the complement, reverse complement and reverse sequence
are as follows:

complement            3' TCCTGG 5'
reverse complement    3' GGTCCT 5'
reverse sequence      5' CCAGGA 3'.
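The three terms can also be expressed as simple string operations. The following Python sketch is illustrative only; the function names are hypothetical, and each function returns the bases in the left-to-right order in which they are printed in the example above, with the 5'/3' end labels left to the reader.

```python
# Watson-Crick base pairing for DNA (A<->T, C<->G).
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Pair each base with its partner, keeping the written order."""
    return seq.upper().translate(DNA_COMPLEMENT)


def reverse_sequence(seq):
    """Write the same bases in the opposite order."""
    return seq.upper()[::-1]


def reverse_complement(seq):
    """Complement the sequence, then reverse the written order."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    example = "AGGACC"
    print(complement(example))          # TCCTGG
    print(reverse_complement(example))  # GGTCCT
    print(reverse_sequence(example))    # CCAGGA
```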

As used herein, the term "variant" covers any sequence which exhibits at least
about 50%, more preferably at least about 70% and, more preferably yet, at least
about 90% identity to a sequence of the present invention. Most preferably, a
"variant" is any sequence which has at least about a 99% probability of being the
same as the inventive sequence. The probability for DNA sequences is measured by
the computer algorithm FASTA (version 2.0u4, February 1996; Pearson W. R. et al.,
Proc. Natl. Acad. Sci. 85:2444-2448, 1988), the probability for translated DNA
sequences is measured by the computer algorithm TBLASTX and that for protein
sequences is measured by the computer algorithm BLASTP (Altschul, S. F. et al., J.
Mol. Biol. 215:403-410, 1990). The term "variants" thus encompasses sequences
wherein the probability of finding a match by chance (smallest sum probability) in a
database is less than about 1% as measured by any of the above tests.
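The percent-identity thresholds in this definition can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the two sequences are taken to be already aligned and of equal length, with no gaps, and the function names are hypothetical. It does not reproduce the probability scores reported by FASTA, TBLASTX or BLASTP, which require those programs and a sequence database.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def identity_tier(pct):
    """Map a percent identity onto the 50% / 70% / 90% tiers named in the text."""
    if pct >= 90.0:
        return "at least about 90%"
    if pct >= 70.0:
        return "at least about 70%"
    if pct >= 50.0:
        return "at least about 50%"
    return "below 50%"


if __name__ == "__main__":
    pct = percent_identity("AGGACC", "AGGTCC")  # 5 of 6 positions match
    print(round(pct, 1), identity_tier(pct))
```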

Variants of the isolated sequences from other eucalyptus and pine species, as 
well as from other commercially important species utilized by the lumber industry, 
are contemplated. These include the following gymnosperms, by way of example: 
loblolly pine Pinus taeda, slash pine Pinus elliottii, sand pine Pinus clausa, longleaf pine
Pinus palustris, shortleaf pine Pinus echinata, ponderosa pine Pinus ponderosa, Jeffrey
pine Pinus jeffreyi, red pine Pinus resinosa, pitch pine Pinus rigida, jack pine Pinus
banksiana, pond pine Pinus serotina, Eastern white pine Pinus strobus, Western white
pine Pinus monticola, sugar pine Pinus lambertiana, Virginia pine Pinus virginiana,
lodgepole pine Pinus contorta, Caribbean pine Pinus caribaea, P. pinaster, Calabrian
pine P. brutia, Afghan pine P. eldarica, Coulter pine P. coulteri, European pine P.
nigra and P. sylvestris; Douglas-fir Pseudotsuga menziesii; the hemlocks which include
Western hemlock Tsuga heterophylla, Eastern hemlock Tsuga canadensis, Mountain
hemlock Tsuga mertensiana; the spruces which include the Norway spruce Picea abies,
red spruce Picea rubens, white spruce Picea glauca, black spruce Picea mariana, Sitka
spruce Picea sitchensis, Engelmann spruce Picea engelmannii, and blue spruce Picea
pungens; redwood Sequoia sempervirens; the true firs which include the Alpine fir Abies
lasiocarpa, silver fir Abies amabilis, grand fir Abies grandis, noble fir Abies procera,
white fir Abies concolor, California red fir Abies magnifica, and balsam fir Abies
balsamea; the cedars which include the Western red cedar Thuja plicata, incense
cedar Libocedrus decurrens, Northern white cedar Thuja occidentalis, Port Orford cedar
Chamaecyparis lawsoniana, Atlantic white cedar Chamaecyparis thyoides, Alaska
yellow-cedar Chamaecyparis nootkatensis, and Eastern red cedar Juniperus virginiana;
the larches which include Eastern larch Larix laricina, Western larch Larix
occidentalis, European larch Larix decidua, Japanese larch Larix leptolepis, and
Siberian larch Larix sibirica; bald cypress Taxodium distichum and Giant sequoia
Sequoia gigantea;

and the following angiosperms, by way of example: 

Eucalyptus alba, E. bancroftii, E. botryoides, E. bridgesiana, E. calophylla, E.
camaldulensis, E. citriodora, E. cladocalyx, E. coccifera, E. curtisii, E. dalrympleana, E.
deglupta, E. delegatensis, E. diversicolor, E. dunnii, E. ficifolia, E. globulus, E.
gomphocephala, E. gunnii, E. henryi, E. laevopinea, E. macarthurii, E. macrorhyncha,
E. maculata, E. marginata, E. megacarpa, E. melliodora, E. nicholii, E. nitens, E.
nova-anglica, E. obliqua, E. obtusiflora, E. oreades, E. pauciflora, E. polybractea, E. regnans,
E. resinifera, E. robusta, E. rudis, E. saligna, E. sideroxylon, E. stuartiana, E. tereticornis,
E. torelliana, E. urnigera, E. urophylla, E. viminalis, E. viridis, E. wandoo and E.
youmanii.

The inventive DNA sequences may be isolated by high throughput sequencing 
of cDNA libraries such as those prepared from Eucalyptus grandis and Pinus radiata 
as described below in Examples 1 and 2. Alternatively, oligonucleotide probes based 
on the sequences provided in SEQ ID NO: 1-13 and 16-88 can be synthesized and 
used to identify positive clones in either cDNA or genomic DNA libraries from 
Eucalyptus grandis and Pinus radiata, or from other gymnosperms and angiosperms 
including those identified above, by means of hybridization or PCR techniques. 
Probes can be shorter than the sequences provided herein but should be at least 
about 10, preferably at least about 15 and most preferably at least about 20 
nucleotides in length. Hybridization and PCR techniques suitable for use with such 
oligonucleotide probes are well known in the art. Positive clones may be analyzed 
by restriction enzyme digestion, DNA sequencing or the like. 
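As an illustration of the probe lengths mentioned above, the following Python sketch lists candidate probe windows of a chosen length along a sequence and refuses lengths below the stated minimum of about 10 nucleotides. The function name, the default step size and the example sequence are hypothetical; real probe design would also weigh factors such as melting temperature and specificity, which are not modeled here.

```python
MIN_PROBE_LENGTH = 10  # "at least about 10" nucleotides, per the text
DEFAULT_PROBE_LENGTH = 20  # "most preferably at least about 20" nucleotides


def candidate_probes(sequence, length=DEFAULT_PROBE_LENGTH, step=5):
    """Return subsequences of `length` nucleotides taken every `step` positions."""
    if length < MIN_PROBE_LENGTH:
        raise ValueError(f"probe length must be at least {MIN_PROBE_LENGTH}")
    if step < 1:
        raise ValueError("step must be a positive integer")
    sequence = sequence.upper()
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # Hypothetical 40-nucleotide sequence, used only to show the windowing.
    demo = "ACGT" * 10
    for probe in candidate_probes(demo, length=20, step=10):
        print(probe)
```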
In addition, the DNA sequences of the present invention may be generated by
synthetic means using techniques well known in the art. Equipment for automated
synthesis of oligonucleotides is commercially available from suppliers such as Perkin
Elmer/Applied Biosystems Division (Foster City, CA) and may be operated according
to the manufacturer's instructions.

In one embodiment, the DNA constructs of the present invention include an
open reading frame coding for at least a functional portion of an enzyme encoded by a
nucleotide sequence of the present invention or a variant thereof. As used herein, the
"functional portion" of an enzyme is that portion which contains the active site
essential for effecting the metabolic step, i.e. the portion of the molecule that is capable
of binding one or more reactants or is capable of improving or regulating the rate of
reaction. The active site may be made up of separate portions present on one or more
polypeptide chains and will generally exhibit high substrate specificity. The term
"enzyme encoded by a nucleotide sequence", as used herein, includes enzymes encoded
by a nucleotide sequence which includes the partial isolated DNA sequences of the
present invention.

For applications where amplification of lignin synthesis is desired, the open
reading frame is inserted in the DNA construct in a sense orientation, such that
transformation of a target plant with the DNA construct will lead to an increase in the
number of copies of the gene and therefore an increase in the amount of enzyme. When
down-regulation of lignin synthesis is desired, the open reading frame is inserted in the
DNA construct in an antisense orientation, such that the RNA produced by transcription
of the DNA sequence is complementary to the endogenous mRNA sequence. This, in
turn, will result in a decrease in the number of copies of the gene and therefore a
decrease in the amount of enzyme. Alternatively, regulation can be achieved by
inserting appropriate sequences or subsequences (e.g. DNA or RNA) in ribozyme
constructs.
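The statement that an antisense-oriented open reading frame yields RNA complementary to the endogenous mRNA can be checked on a toy sequence. The Python sketch below is illustrative only: the six-nucleotide sequence and all function names are hypothetical, and only Watson-Crick pairs are considered.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def transcribe(coding_strand):
    """RNA carrying the same 5'->3' sequence as the given DNA coding strand."""
    return coding_strand.upper().replace("T", "U")


def antisense_insert(orf):
    """The ORF placed in antisense orientation: its reverse complement."""
    return orf.upper().translate(DNA_COMPLEMENT)[::-1]


def pairs_fully(rna_a, rna_b):
    """True if rna_a (5'->3') base-pairs with rna_b in antiparallel orientation."""
    return len(rna_a) == len(rna_b) and all(
        (a, b) in RNA_PAIRS for a, b in zip(rna_a, reversed(rna_b))
    )


if __name__ == "__main__":
    orf = "ATGGCC"  # hypothetical fragment, not taken from any SEQ ID NO
    mrna = transcribe(orf)
    antisense_rna = transcribe(antisense_insert(orf))
    print(mrna, antisense_rna, pairs_fully(mrna, antisense_rna))
```

A sense-oriented insert gives RNA identical to the mRNA, which does not pair with it; only the antisense orientation produces a transcript able to form a duplex with the endogenous message.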

In a second embodiment, the inventive DNA constructs comprise a nucleotide
sequence including a non-coding region of a gene coding for an enzyme encoded by a
DNA sequence of the present invention, or a nucleotide sequence complementary to
such a non-coding region. As used herein the term "non-coding region" includes both
transcribed sequences which are not translated, and non-transcribed sequences within
about 2000 base pairs 5' or 3' of the translated sequences or open reading frames.
Examples of non-coding regions which may be usefully employed in the inventive
constructs include introns and 5'-non-coding leader sequences. Transformation of a
target plant with such a DNA construct may lead to a reduction in the amount of lignin
synthesized by the plant by the process of cosuppression, in a manner similar to that
discussed, for example, by Napoli et al. (Plant Cell 2:279-290, 1990) and de Carvalho
Niebel et al. (Plant Cell 7:347-358, 1995).

The DNA constructs of the present invention further comprise a gene promoter
sequence and a gene termination sequence, operably linked to the DNA sequence to be
transcribed, which control expression of the gene. The gene promoter sequence is
generally positioned at the 5' end of the DNA sequence to be transcribed, and is
employed to initiate transcription of the DNA sequence. Gene promoter sequences are
generally found in the 5' non-coding region of a gene but they may exist in introns
(Luehrsen, K. R., Mol. Gen. Genet. 225:81-93, 1991) or in the coding region, as for
example in PAL of tomato (Bloksberg, 1991, Studies on the Biology of Phenylalanine
Ammonia Lyase and Plant Pathogen Interaction, Ph.D. Thesis, Univ. of California,
Davis, University Microfilms International order number 9217564). When the
construct includes an open reading frame in a sense orientation, the gene promoter
sequence also initiates translation of the open reading frame. For DNA constructs
comprising either an open reading frame in an antisense orientation or a non-coding
region, the gene promoter sequence consists only of a transcription initiation site having
an RNA polymerase binding site.

A variety of gene promoter sequences which may be usefully employed in the
DNA constructs of the present invention are well known in the art. The gene promoter
sequence, and also the gene termination sequence, may be endogenous to the target
plant host or may be exogenous, provided the promoter is functional in the target host.
For example, the promoter and termination sequences may be from other plant species,
plant viruses, bacterial plasmids and the like. Preferably, gene promoter and
termination sequences are from the inventive sequences themselves.

Factors influencing the choice of promoter include the desired tissue specificity
of the construct, and the timing of transcription and translation. For example,
constitutive promoters, such as the 35S Cauliflower Mosaic Virus (CaMV 35S)
promoter, will affect the activity of the enzyme in all parts of the plant. Use of a
tissue-specific promoter will result in production of the desired sense or antisense RNA only
in the tissue of interest. With DNA constructs employing inducible gene promoter
sequences, the rate of RNA polymerase binding and initiation can be modulated by
external stimuli, such as light, heat, anaerobic stress, alteration in nutrient conditions
and the like. Temporally regulated promoters can be employed to effect modulation of
the rate of RNA polymerase binding and initiation at a specific time during
development of a transformed cell. Preferably, the original promoters from the enzyme
gene in question, or promoters from a specific tissue-targeted gene in the organism to
be transformed, such as eucalyptus or pine, are used. Other examples of gene promoters
which may be usefully employed in the present invention include mannopine synthase
(mas), octopine synthase (ocs) and those reviewed by Chua et al. (Science 244:174-181,
1989).

The gene termination sequence, which is located 3' to the DNA sequence to be
transcribed, may come from the same gene as the gene promoter sequence or may be
from a different gene. Many gene termination sequences known in the art may be
usefully employed in the present invention, such as the 3' end of the Agrobacterium
tumefaciens nopaline synthase gene. However, preferred gene terminator sequences are
those from the original enzyme gene or from the target species to be transformed.

The DNA constructs of the present invention may also contain a selection
marker that is effective in plant cells, to allow for the detection of transformed cells
containing the inventive construct. Such markers, which are well known in the art,
typically confer resistance to one or more toxins. One example of such a marker is the
NPTII gene whose expression results in resistance to kanamycin or hygromycin,
antibiotics which are usually toxic to plant cells at a moderate concentration (Rogers et
al. in Methods for Plant Molecular Biology, A. Weissbach and H. Weissbach, eds.,
Academic Press Inc., San Diego, CA (1988)). Alternatively, the presence of the desired
construct in transformed cells can be determined by means of other techniques well
known in the art, such as Southern and Western blots.

Techniques for operatively linking the components of the inventive DNA
constructs are well known in the art and include the use of synthetic linkers containing
one or more restriction endonuclease sites as described, for example, by Maniatis et al.
(Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold
Spring Harbor, NY, 1989). The DNA construct of the present invention may be linked
to a vector having at least one replication system, for example, E. coli, whereby after
each manipulation, the resulting construct can be cloned and sequenced and the
correctness of the manipulation determined.

The DNA constructs of the present invention may be used to transform a variety
of plants, including monocotyledonous plants (e.g. grasses, corn, grains, oat, wheat and
barley), dicotyledonous plants (e.g. Arabidopsis, tobacco, legumes, alfalfa, oaks,
eucalyptus, maple) and gymnosperms (e.g. Scots pine (Aronen, Finnish Forest Res. Papers,
vol. 595, 1996), white spruce (Ellis et al., Bio/Technology 11:84-89, 1993) and larch
(Huang et al., In Vitro Cell 27:201-207, 1991)). In a preferred embodiment, the inventive DNA
constructs are employed to transform woody plants, herein defined as a tree or shrub
whose stem lives for a number of years and increases in diameter each year by the
addition of woody tissue. Preferably the target plant is selected from the group
consisting of eucalyptus and pine species, most preferably from the group consisting of
Eucalyptus grandis and Pinus radiata. As discussed above, transformation of a plant
with a DNA construct including an open reading frame coding for an enzyme encoded
by an inventive DNA sequence wherein the open reading frame is orientated in a sense
direction will lead to an increase in lignin content of the plant or, in some cases, to a
decrease by cosuppression. Transformation of a plant with a DNA construct
comprising an open reading frame in an antisense orientation or a non-coding
(untranslated) region of a gene will lead to a decrease in the lignin content of the
transformed plant.

Techniques for stably incorporating DNA constructs into the genome of target
plants are well known in the art and include Agrobacterium tumefaciens mediated
introduction, electroporation, protoplast fusion, injection into reproductive organs,
injection into immature embryos, high velocity projectile introduction and the like. The
choice of technique will depend upon the target plant to be transformed. For example,
dicotyledonous plants and certain monocots and gymnosperms may be transformed by
Agrobacterium Ti plasmid technology, as described, for example, by Bevan (Nucl. Acid
Res. 12:8711-8721, 1984). Targets for the introduction of the DNA constructs of the
present invention include tissues, such as leaf tissue, disseminated cells, protoplasts,
seeds, embryos, meristematic regions, cotyledons, hypocotyls, and the like. One
preferred method for transforming eucalyptus and pine is a biolistic method using
pollen (see, for example, Aronen 1996, Finnish Forest Res. Papers vol. 595, 53pp) or
easily regenerable embryonic tissues. Other transformation techniques which may be
usefully employed in the inventive methods include those taught by Ellis et al. (Plant
Cell Reports 8:16-20, 1989), Wilson et al. (Plant Cell Reports 7:704-707, 1989) and
Tautorus et al. (Theor. Appl. Genet. 78:531-536, 1989).

Once the cells are transformed, cells having the inventive DNA construct
incorporated in their genome may be selected by means of a marker, such as the
kanamycin resistance marker discussed above. Transgenic cells may then be cultured
in an appropriate medium to regenerate whole plants, using techniques well known in
the art. In the case of protoplasts, the cell wall is allowed to reform under appropriate
osmotic conditions. In the case of seeds or embryos, an appropriate germination or
callus initiation medium is employed. For explants, an appropriate regeneration
medium is used. Regeneration of plants is well established for many species. For a
review of regeneration of forest trees see Dunstan et al., Somatic embryogenesis in
woody plants. In: Thorpe, T.A. ed., 1995: In vitro embryogenesis of plants. Vol. 20 in
Current Plant Science and Biotechnology in Agriculture, Chapter 12, pp. 471-540.
Specific protocols for the regeneration of spruce are discussed by Roberts et al.
(Somatic Embryogenesis of Spruce. In: Synseed. Applications of synthetic seed to crop
improvement. Redenbaugh, K., ed. CRC Press, Chapter 23, pp. 427-449, 1993). The
resulting transformed plants may be reproduced sexually or asexually, using methods
well known in the art, to give successive generations of transgenic plants.

As discussed above, the production of RNA in target plant cells can be
controlled by choice of the promoter sequence, or by selecting the number of functional
copies or the site of integration of the DNA sequences incorporated into the genome of
the target plant host. A target plant may be transformed with more than one DNA
construct of the present invention, thereby modulating the lignin biosynthetic pathway
for the activity of more than one enzyme, affecting enzyme activity in more than one
tissue or affecting enzyme activity at more than one expression time. Similarly, a DNA
construct may be assembled containing more than one open reading frame coding for
an enzyme encoded by a DNA sequence of the present invention or more than one
non-coding region of a gene coding for such an enzyme. The DNA sequences of the present
invention may also be employed in combination with other known sequences encoding
enzymes involved in the lignin biosynthetic pathway. In this manner, it may be
possible to add a lignin biosynthetic pathway to a non-woody plant to produce a new
woody plant.

The isolated DNA sequences of the present invention may also be employed as
probes to isolate DNA sequences encoding enzymes involved in the lignin synthetic
pathway from other plant species, using techniques well known to those of skill in the
art.

The following examples are offered by way of illustration and not by way of
limitation.

Example 1

Isolation and Characterization of cDNA Clones from Eucalyptus grandis

Two Eucalyptus grandis cDNA expression libraries (one from a mixture of
various tissues from a single tree and one from leaves of a single tree) were constructed
and screened as follows.

mRNA was extracted from the plant tissue using the protocol of Chang et al.
(Plant Molecular Biology Reporter 11:113-116 (1993)) with minor modifications.
Specifically, samples were dissolved in CPC-RNAXB (100 mM Tris-Cl, pH 8.0; 25
mM EDTA; 2.0 M NaCl; 2% CTAB; 2% PVP and 0.05% spermidine·3HCl) and
extracted with chloroform:isoamyl alcohol, 24:1. mRNA was precipitated with ethanol
and the total RNA preparation was purified using a Poly(A) Quik mRNA Isolation Kit
(Stratagene, La Jolla, CA). A cDNA expression library was constructed from the
purified mRNA by reverse transcriptase synthesis followed by insertion of the resulting
cDNA clones in Lambda ZAP using a ZAP Express cDNA Synthesis Kit (Stratagene),
according to the manufacturer's protocol. The resulting cDNAs were packaged using a
Gigapack II Packaging Extract (Stratagene) employing 1 µl of sample DNA from the 5
µl ligation mix. Mass excision of the library was done using XL1-Blue MRF' cells and
XLOLR cells (Stratagene) with ExAssist helper phage (Stratagene). The excised
phagemids were diluted with NZY broth (Gibco BRL, Gaithersburg, MD) and plated
out onto LB-kanamycin agar plates containing X-gal and isopropylthio-beta-galactoside
(IPTG).

Of the colonies plated and picked for DNA miniprep, 99% contained an insert
suitable for sequencing. Positive colonies were cultured in NZY broth with kanamycin
and cDNA was purified by means of alkaline lysis and polyethylene glycol (PEG)
precipitation. Agarose gel at 1% was used to screen sequencing templates for
chromosomal contamination. Dye primer sequences were prepared using a Turbo
Catalyst 800 machine (Perkin Elmer/Applied Biosystems, Foster City, CA) according
to the manufacturer's protocol.

DNA sequence for positive clones was obtained using an Applied Biosystems
Prism 377 sequencer. cDNA clones were sequenced first from the 5' end and, in
some cases, also from the 3' end. For some clones, internal sequence was obtained
using subcloned fragments. Subcloning was performed using standard procedures of
restriction mapping and subcloning to pBluescript II SK+ vector.

The determined cDNA sequence was compared to known sequences in the
EMBL database (release 46, March 1996) using the FASTA algorithm of February
1996 (version 2.0u4) (available on the Internet at the ftp site
ftp://ftp.virginia.edu/pub/fasta/). Multiple alignments of redundant sequences were
used to build up reliable consensus sequences. Based on similarity to known sequences
from other plant species, the isolated DNA sequence (SEQ ID NO: 1) was identified as
encoding a CAD enzyme.

In further studies, using the procedure described above, cDNA sequences
encoding the following Eucalyptus grandis enzymes were isolated: PAL (SEQ ID NO:
16); C4H (SEQ ID NO: 17); C3H (SEQ ID NO: 18); F5H (SEQ ID NO: 19-21); OMT
(SEQ ID NO: 22-25); CCR (SEQ ID NO: 26-29); CAD (SEQ ID NO: 30); CGT (SEQ
ID NO: 31-33); CBG (SEQ ID NO: 34); PNL (SEQ ID NO: 35, 36); LAC (SEQ ID
NO: 37-41); and POX (SEQ ID NO: 42-44).

Example 2

Isolation and Characterization of cDNA Clones from Pinus radiata

a) Isolation of cDNA clones by high through-put screening

A Pinus radiata cDNA expression library was constructed from xylem and
screened as described above in Example 1. DNA sequence for positive clones was
obtained using forward and reverse primers on an Applied Biosystems Prism 377
sequencer and the determined sequences were compared to known sequences in the
database as described above.
Based on similarity to known sequences from other plant species, the isolated
DNA sequences were identified as encoding the enzymes C4H (SEQ ID NO: 2 and 3),
C3H (SEQ ID NO: 4), PNL (SEQ ID NO: 5), OMT (SEQ ID NO: 6), CAD (SEQ ID
NO: 7), CCR (SEQ ID NO: 8), PAL (SEQ ID NO: 9-11) and 4CL (SEQ ID NO: 12).

In further studies, using the procedure described above, additional cDNA clones
encoding the following Pinus radiata enzymes were isolated: PAL (SEQ ID NO: 45-47);
C4H (SEQ ID NO: 48, 49); C3H (SEQ ID NO: 50-52); OMT (SEQ ID NO: 53-55);
4CL (SEQ ID NO: 56, 57); CCR (SEQ ID NO: 58-70); CAD (SEQ ID NO: 71);
CGT (SEQ ID NO: 72); CBG (SEQ ID NO: 73-80); PNL (SEQ ID NO: 81); LAC
(SEQ ID NO: 82-84); and POX (SEQ ID NO: 85-88).

b) Isolation of cDNA clones by PCR

Two PCR probes, hereinafter referred to as LNB010 and LNB011 (SEQ ID NO:
14 and 15, respectively) were designed based on conserved domains in the following
peroxidase sequences previously identified in other species: vanpox, hvupox6, taepox,
hvupox1, osapox, ntopox2, ntopox1, lespox, pokpox, luspox, athpox, hrpox, spopox,
and tvepox (Genbank accession nos. D11337, M83671, X56011, X58396, X66125,
J02979, D11396, X71593, D11102, L07554, M58381, X57564, Z22920, and Z31011,
respectively).

RNA was isolated from pine xylem and first strand cDNA was synthesized as
described above. This cDNA was subjected to PCR using 4 µM LNB010, 4 µM
LNB011, 1 x Kogen's buffer, 0.1 mg/ml BSA, 200 mM dNTP, 2 mM Mg2+, and 0.1
U/µl of Taq polymerase (Gibco BRL). Conditions were 2 cycles of 2 min at 94 °C, 1
min at 55 °C and 1 min at 72 °C; 25 cycles of 1 min at 94 °C, 1 min at 55 °C, and 1 min
at 72 °C; and 18 cycles of 1 min at 94 °C, 1 min at 55 °C, and 3 min at 72 °C in a
Stratagene Robocycler. The gene was re-amplified in the same manner. A band of
about 200 bp was purified from a TAE agarose gel using a Schleicher & Schuell
Elu-Quik DNA purification kit and cloned into a T-tailed pBluescript vector (Marchuk D. et
al., Nucleic Acids Res. 19:1154, 1991). Based on similarity to known sequences, the
isolated gene (SEQ ID NO: 13) was identified as encoding pine peroxidase (POX).
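
The cycling program described above can be tallied to check the total number of cycles and the minimum block time. The following is a minimal sketch only: the segment values are transcribed from the conditions stated above, whereas the names `PROGRAM` and `program_totals` are introduced here purely for illustration, and ramp times between temperatures are ignored.

```python
# Each segment: (number of cycles, [(temperature in deg C, hold time in minutes), ...]).
# Values transcribed from the cycling conditions stated in the text.
PROGRAM = [
    (2,  [(94, 2), (55, 1), (72, 1)]),
    (25, [(94, 1), (55, 1), (72, 1)]),
    (18, [(94, 1), (55, 1), (72, 3)]),
]


def program_totals(program):
    """Return (total cycles, total hold time in minutes), ignoring ramp times."""
    cycles = sum(n for n, _ in program)
    minutes = sum(n * sum(hold for _, hold in steps) for n, steps in program)
    return cycles, minutes


total_cycles, total_minutes = program_totals(PROGRAM)
print(total_cycles, total_minutes)  # 45 173
```

Under these assumptions a single amplification round comprises 45 cycles and at least 173 minutes of hold time; the re-amplification repeats the same program.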
Example 3

Use of an O-methyltransferase (OMT) Gene to Modify Lignin Biosynthesis

a) Transformation of tobacco plants with a Pinus radiata OMT gene

Sense and anti-sense constructs containing a sequence including the coding
region of OMT (SEQ ID NO: 53) from Pinus radiata were inserted into Agrobacterium
tumefaciens LBA4301 (provided as a gift by Dr. C. Kado, University of California,
Davis, CA) by direct transformation using published methods (see, An G, Ebert PR,
Mitra A, Ha SB: Binary Vectors. In: Gelvin SB, Schilperoort RA (eds) Plant
Molecular Biology Manual, Kluwer Academic Publishers, Dordrecht (1988)). The
presence and integrity of the transgenic constructs were verified by restriction digestion
and DNA sequencing.

Tobacco (Nicotiana tabacum cv. Samsun) leaf sections were transformed using
the method of Horsch et al. (Science, 227:1229-1231, 1985). Five independent
transformed plant lines were established for the sense construct and eight independent
transformed plant lines were established for the anti-sense construct for OMT.
Transformed plants containing the appropriate lignin gene construct were verified using
Southern blot experiments. A "+" in the column labeled "Southern" in Table 1 below
indicates that the transformed plant lines were confirmed as independent transformed
lines.

b) Expression of Pinus OMT in transformed plants

Total RNA was isolated from each independent transformed plant line created
with the OMT sense and anti-sense constructs. The RNA samples were analysed in
Northern blot experiments to determine the level of expression of the transgene in each
transformed line. The data shown in the column labeled "Northern" in Table 1 show
that the transformed plant lines containing the sense and anti-sense constructs for OMT
all exhibited high levels of expression, relative to the background on the Northern blots.
OMT expression in sense plant line number 2 was not measured because the RNA
sample showed signs of degradation. There was no detectable hybridisation to RNA
samples from empty vector-transformed control plants.




WO 98/11205 



PCT/NZ97/00112 



c) Modulation of OMT enzyme activity in transformed plants

The total activity of OMT enzyme, encoded by the Pinus OMT gene and by the
endogenous tobacco OMT gene, in transformed tobacco plants was analysed for each
transformed plant line created with the OMT sense and anti-sense constructs. Crude
protein extracts were prepared from each transformed plant and assayed using the
method of Zhang et al. (Plant Physiol. 113:65-74, 1997). The data contained in the
column labeled "Enzyme" in Table 1 show that the transformed plant lines containing
the OMT sense construct generally had elevated OMT enzyme activity, with a
maximum of 199%, whereas the transformed plant lines containing the OMT anti-sense
construct generally had reduced OMT enzyme activity, with a minimum of 35%,
relative to empty vector-transformed control plants. OMT enzyme activity was not
estimated in sense plant line number 3.

d) Effects of Pinus OMT on lignin concentration in transformed plants

The concentration of lignin in the transformed tobacco plants was determined
using the well-established procedure of thioglycolic acid extraction (see, Freudenberg
et al. in "Constitution and Biosynthesis of Lignin", Springer-Verlag, Berlin, 1968).
Briefly, whole tobacco plants, of an average age of 38 days, were frozen in liquid
nitrogen and ground to a fine powder in a mortar and pestle. 100 mg of frozen powder
from one empty vector-transformed control plant line, the five independent transformed
plant lines containing the sense construct for OMT and the eight independent
transformed plant lines containing the anti-sense construct for OMT were extracted
individually with methanol, followed by 10% thioglycolic acid and finally dissolved in
1 M NaOH. The final extracts were assayed for absorbance at 280 nm. The data shown
in the column labelled "TGA" in Table 1 show that the transformed plant lines
containing the sense and the anti-sense OMT gene constructs all exhibited significantly
decreased levels of lignin, relative to the empty vector-transformed control plant lines.






Table 1

plant line   transgene   orientation   Southern   Northern   Enzyme   TGA
1            control     na            +          blank      100      104
1            OMT         sense         +          2.9E+6     86       55
2            OMT         sense         +          na         162      58
3            OMT         sense         +          4.1E+6     na       63
4            OMT         sense         +          2.3E+6     142      66
5            OMT         sense         +          3.6E+5     199      75
1            OMT         anti-sense    +          1.6E+4     189      66
2            OMT         anti-sense    +          5.7E+3     35       70
3            OMT         anti-sense    +          8.0E+3     105      73
4            OMT         anti-sense    +          1.4E+4     109      74
5            OMT         anti-sense    +          2.5E+4     87       78
6            OMT         anti-sense    +          2.5E+4     58       84
7            OMT         anti-sense    +          2.5E+4     97       92
8            OMT         anti-sense    +          1.1E+4     151      94


These data clearly indicate that lignin concentration, as measured by the TGA 
assay, can be directly manipulated by either sense or anti-sense expression of a lignin 
biosynthetic gene such as OMT. 
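
The TGA column of Table 1 can be summarised per construct orientation. The following is a minimal sketch: the values are transcribed from Table 1, while the grouping into a dictionary and the helper name `summarize` are illustrative choices and not part of the described method.

```python
from statistics import mean

# TGA values transcribed from Table 1, grouped by transgene orientation.
TGA = {
    "control": [104],
    "sense": [55, 58, 63, 66, 75],
    "anti-sense": [66, 70, 73, 74, 78, 84, 92, 94],
}


def summarize(values):
    """Return the number of lines, mean, minimum and maximum of a list of TGA values."""
    return {"n": len(values), "mean": mean(values),
            "min": min(values), "max": max(values)}


summary = {group: summarize(values) for group, values in TGA.items()}
for group, stats in summary.items():
    print(group, stats)
```

On these figures the sense lines average 63.4 and the anti-sense lines 78.875, against a control value of 104, and every transformed line falls below the control.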

Example 4

Use of a 4-Coumarate:CoA ligase (4CL) Gene to Modify Lignin Biosynthesis

a) Transformation of tobacco plants with a Pinus radiata 4CL gene

Sense and anti-sense constructs containing a sequence including the coding
region of 4CL (SEQ ID NO: 56) from Pinus radiata were inserted into Agrobacterium
tumefaciens LBA4301 by direct transformation as described above. The presence and
integrity of the transgenic constructs were verified by restriction digestion and DNA
sequencing.

Tobacco (Nicotiana tabacum cv. Samsun) leaf sections were transformed as
described above. Five independent transformed plant lines were established for the
sense construct and eight independent transformed plant lines were established for the
anti-sense construct for 4CL. Transformed plants containing the appropriate lignin
gene construct were verified using Southern blot experiments. A "+" in the column
labeled "Southern" in Table 2 indicates that the transformed plant lines listed were
confirmed as independent transformed lines.

b) Expression of Pinus 4CL in transformed plants

Total RNA was isolated from each independent transformed plant line created
with the 4CL sense and anti-sense constructs. The RNA samples were analysed in
Northern blot experiments to determine the level of expression of the transgene in each
transformed line. The data shown in the column labelled "Northern" in Table 2 below
show that the transformed plant lines containing the sense and anti-sense constructs for
4CL all exhibit high levels of expression, relative to the background on the Northern
blots. 4CL expression in anti-sense plant line number 1 was not measured because the
RNA was not available at the time of the experiment. There was no detectable
hybridisation to RNA samples from empty vector-transformed control plants.

c) Modulation of 4CL enzyme activity in transformed plants

The total activity of 4CL enzyme, encoded by the Pinus 4CL gene and by the
endogenous tobacco 4CL gene, in transformed tobacco plants was analysed for each
transformed plant line created with the 4CL sense and anti-sense constructs. Crude
protein extracts were prepared from each transformed plant and assayed using the
method of Zhang et al. (Plant Physiol. 113:65-74, 1997). The data contained in the
column labeled "Enzyme" in Table 2 show that the transformed plant lines containing
the 4CL sense construct had elevated 4CL enzyme activity, with a maximum of 258%,
and the transformed plant lines containing the 4CL anti-sense construct had reduced
4CL enzyme activity, with a minimum of 59%, relative to empty vector-transformed
control plants.

d) Effects of Pinus 4CL on lignin concentration in transformed plants

The concentration of lignin in samples of transformed plant material was
determined as described in Example 3. The data shown in the column labelled "TGA"
in Table 2 show that the transformed plant lines containing the sense and the
anti-sense 4CL gene constructs all exhibited significantly decreased levels of lignin, relative
to the empty vector-transformed control plant lines. These data clearly indicate that
lignin concentration, as measured by the TGA assay, can be directly manipulated by
either sense or anti-sense expression of a lignin biosynthetic gene such as 4CL.



Table 2

plant line   transgene   orientation   Southern      Northern      Enzyme        TGA
1            control     na            [illegible]   blank         100           92
2            control     na            +             blank         100           104
1            4CL         sense         +             2.3E+4        169           64
2            4CL         sense         [illegible]   4.5E+4        258           73
3            4CL         sense         +             3.1E+4        174           77
4            4CL         sense         +             [illegible]   [illegible]   [illegible]
5            4CL         sense         [illegible]   1.6E+4        184           92
1            4CL         anti-sense    +             na            59            75
2            4CL         anti-sense    +             1.6E+4        [illegible]   75
3            4CL         anti-sense    [illegible]   9.6E+3        81            80
4            4CL         anti-sense    [illegible]   1.2E+4        90            83
5            4CL         anti-sense    [illegible]   4.7E+3        101           88
6            4CL         anti-sense    +             3.9E+3        116           89
7            4CL         anti-sense    +             1.8E+3        125           94
8            4CL         anti-sense    +             1.7E+4        106           97

















Example 5

Transformation of Tobacco using the Inventive Lignin Biosynthetic Genes

Sense and anti-sense constructs containing sequences including the coding
regions of C3H (SEQ ID NO: 18), F5H (SEQ ID NO: 19), CCR (SEQ ID NO: 25) and
CGT (SEQ ID NO: 31) from Eucalyptus grandis, and PAL (SEQ ID NO: 45 and 47),
C4H (SEQ ID NO: 48 and 49), PNL (SEQ ID NO: 81) and LAC (SEQ ID NO: 83)
from Pinus radiata were inserted into Agrobacterium tumefaciens LBA4301 by direct
transformation as described above. The presence and integrity of the transgenic
constructs were verified by restriction digestion and DNA sequencing.

Tobacco (Nicotiana tabacum cv. Samsun) leaf sections were transformed as
described in Example 3. Up to twelve independent transformed plant lines were
established for each sense construct and each anti-sense construct listed in the
preceding paragraph. Transformed plants containing the appropriate lignin gene
construct were verified using Southern blot experiments. All of the transformed plant
lines analysed were confirmed as independent transformed lines.

Example 6

Manipulation of Lignin Content in Transformed Plants

a) Determination of transgene expression by Northern blot experiments

Total RNA was isolated from each independent transformed plant line described in
Example 5. The RNA samples were analysed in Northern blot experiments to
determine the level of expression of the transgene in each transformed line. The
column labelled "Northern" in Table 3 shows the level of transgene expression for all
plant lines assayed, relative to the background on the Northern blots. There was no
detectable hybridisation to RNA samples from empty vector-transformed control
plants.

b) Determination of lignin concentration in transformed plants

The concentration of lignin in empty vector-transformed control plant lines and in
up to twelve independent transformed lines for each sense construct and each anti-sense
construct described in Example 5 was determined as described in Example 3. The
column labelled "TGA" in Table 3 shows the thioglycolic acid extractable lignins for
all plant lines assayed, expressed as the average percentage of TGA extractable lignins
in transformed plants versus control plants. The range of variation is shown in
parentheses.
Table 3

transgene   orientation   no. of lines   Northern   TGA
control     na            3              blank      100 (92-104)
C3H         sense         5              3.7E+4     74 (67-85)
F5H         sense         10             5.8E+4     70 (63-79)
F5H         anti-sense    9              5.8E+4     73 (35-93)
CCR         sense         1              na         74
CCR         anti-sense    2              na         74 (62-86)
PAL         sense         5              1.9E+5     77 (71-86)
PAL         anti-sense    4              1.5E+4     62 (37-77)
C4H         anti-sense    10             5.8E+4     86 (52-113)
PNL         anti-sense    6              1.2E+4     88 (70-114)
LAC         sense         5              1.7E+5     na
LAC         anti-sense    12             1.7E+5     88 (73-114)



Transformed plant lines containing the sense and the anti-sense lignin
biosynthetic gene constructs all exhibited significantly decreased levels of lignin,
relative to the empty vector-transformed control plant lines. The most dramatic effects
on lignin concentration were seen in the F5H anti-sense plants with as little as 35% of
the amount of lignin in control plants, and in the PAL anti-sense plants with as little as
37% of the amount of lignin in control plants. These data clearly indicate that lignin
concentration, as measured by the TGA assay, can be directly manipulated by
conventional anti-sense methodology and also by sense over-expression using the
inventive lignin biosynthetic genes.

Example 7

Modulation of Lignin Enzyme Activity in Transformed Plants

The activities and substrate specificities of selected lignin biosynthetic enzymes
were assayed in crude extracts from transformed tobacco plants containing sense and
anti-sense constructs for PAL (SEQ ID NO: 45), PNL (SEQ ID NO: 81) and LAC
(SEQ ID NO: 83) from Pinus radiata, and CGT (SEQ ID NO: 31) from Eucalyptus
grandis.

Enzyme assays were performed using published methods for PAL (Southerton,
S.G. and Deverall, B.J., Plant Path. 39:223-230, 1990), CGT (Vellekoop, P. et al.,
FEBS, 330:36-40, 1993), PNL (Espin, C.J. et al., Phytochemistry 44:17-22, 1997) and
LAC (Bao, W. et al., Science, 260:672-674, 1993). The data shown in the column
labelled "Enzyme" in Table 4 show the average enzyme activity from replicate
measures for all plant lines assayed, expressed as a percent of enzyme activity in empty
vector-transformed control plants. The range of variation is shown in parentheses.



Table 4

transgene   orientation   no. of lines   Enzyme
control     na            3              100
PAL         sense         5              87 (60-124)
PAL         anti-sense    3              53 (38-80)
CGT         anti-sense    1              89
PNL         anti-sense    6              144 (41-279)
LAC         sense         5              78 (16-240)
LAC         anti-sense    11             64 (14-106)



All of the transformed plant lines, except the PNL anti-sense transformed plant 
lines, showed average lignin enzyme activities which were significantly lower than the 
activities observed in empty vector-transformed control plants. The most dramatic 
effects on lignin enzyme activities were seen in the PAL anti-sense transformed plant 
lines in which all of the lines showed reduced PAL activity and in the LAC anti-sense 
transformed plant lines which showed as little as 14% of the LAC activity in empty 
vector-transformed control plant lines. 



Example 8

Functional Identification of Lignin Biosynthetic Genes

Sense constructs containing sequences including the coding regions for PAL
(SEQ ID NO: 47), OMT (SEQ ID NO: 53), 4CL (SEQ ID NO: 56 and 57) and POX
(SEQ ID NO: 86) from Pinus radiata, and OMT (SEQ ID NO: 23 and 24), CCR (SEQ
ID NO: 26-28), CGT (SEQ ID NO: 31 and 33) and POX (SEQ ID NO: 42 and 44) from
Eucalyptus grandis were inserted into the commercially available protein expression
vector, pProEX-1 (Gibco BRL). The resultant constructs were transformed into E. coli
XL1-Blue (Stratagene), which were then induced to produce recombinant protein by the
addition of IPTG. Purified proteins were produced for the Pinus OMT and 4CL
constructs and the Eucalyptus OMT and POX constructs using Ni column
chromatography (Janknecht, R. et al., Proc. Natl. Acad. Sci. 88:8972-8976, 1991).
Enzyme assays for each of the purified proteins conclusively demonstrated the expected
substrate specificity and enzymatic activity for the genes tested.

The data for two representative enzyme assay experiments, demonstrating the
verification of the enzymatic activity of a Pinus radiata 4CL gene (SEQ ID NO: 56)
and a Pinus radiata OMT gene (SEQ ID NO: 53), are shown in Table 5. For the 4CL
enzyme, one unit equals the quantity of protein required to convert the substrate into
product at the rate of 0.1 absorbance units per minute. For the OMT enzyme, one unit
equals the quantity of protein required to convert 1 pmole of substrate to product per
minute.

Table 5

transgene   purification   total ml   total mg   total units   % yield    fold
            step           extract    protein    activity      activity   purification
4CL         crude          10 ml      51 mg      4200          100        1
            Ni column      4 ml       0.84 mg    3680          88         53
OMT         crude          10 ml      74 mg      4600          100        1
            Ni column      4 ml       1.2 mg     4487          98         60



The data shown in Table 5 indicate that both the purified 4CL enzyme and the
purified OMT enzyme show high activity in enzyme assays, confirming the
identification of the 4CL and OMT genes described in this application. Crude protein
preparations from E. coli transformed with empty vector show no activity in either the
4CL or the OMT enzyme assay.
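
The "% yield" and "fold purification" columns of Table 5 can be recomputed from the protein and activity columns. The sketch below assumes the conventional definitions, which the text does not state explicitly: yield is the purified activity as a percentage of the crude activity, and fold purification is the ratio of specific activities (units per mg protein). The names `purification_summary` and `TABLE_5` are illustrative only.

```python
def purification_summary(crude, purified):
    """Return (% yield, fold purification) for one purification step.

    Assumes yield = purified units / crude units, and fold purification =
    ratio of specific activities (units per mg protein).
    """
    yield_pct = 100.0 * purified["units"] / crude["units"]
    specific_crude = crude["units"] / crude["mg"]
    specific_purified = purified["units"] / purified["mg"]
    return yield_pct, specific_purified / specific_crude


# Total protein (mg) and total activity (units) transcribed from Table 5.
TABLE_5 = {
    "4CL": ({"mg": 51.0, "units": 4200.0}, {"mg": 0.84, "units": 3680.0}),
    "OMT": ({"mg": 74.0, "units": 4600.0}, {"mg": 1.2, "units": 4487.0}),
}

results = {name: purification_summary(crude, purified)
           for name, (crude, purified) in TABLE_5.items()}
for name, (yield_pct, fold) in results.items():
    print(f"{name}: yield {yield_pct:.0f}%, {fold:.0f}-fold purification")
```

Under these assumptions the recomputed values round to 88% and 53-fold for 4CL and to 98% and 60-fold for OMT, in agreement with the tabulated figures.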

Although the present invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, changes and
modifications can be carried out without departing from the scope of the invention
which is intended to be limited only by the scope of the appended claims.
SEQUENCE LISTING

(1) GENERAL INFORMATION 

(i) APPLICANT: Genesis Research and Development Corp, Ltd. 

(ii) TITLE OF THE INVENTION: MATERIALS AND METHODS FOR 
THE MODIFICATION OF PLANT LIGNIN CONTENT 

(iii) NUMBER OF SEQUENCES: 88

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Russell McVeagh West-Walker 

(B) STREET: The Todd Building, Cnr Brandon Street & 

Lambton Quay 

(C) CITY: Wellington 

(D) STATE: 

(E) COUNTRY: New Zealand 

(F) ZIP: 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: Wordperfect 5.1 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Bennett, Michael Roy 

(B) REGISTRATION NUMBER: 

(C) REFERENCE/DOCKET NUMBER: 22315\MRB 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: +64 4 495 7740 

(B) TELEFAX: +64 4 499 9306 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 535 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CTTCGCGCTA CCGCATACTC CACCACCGCG TGCAGAAGAT GAGCTCGGAG GGTGGGAAGG
60

AGGATTGCCT CGGTTGGGCT GCCCGGGACC CTTCTGGGTT CCTCTCCCCN TACAAATTCA
120

CCCGCAGGCC GTGGGAAGCG AAGACGTCTC GATTAAGATC ACGCACTGTG GAGTGTGCTA
180

CGCAGATGTG GCTTGGACTA GGAATGTGCA GGGACACTCC AAGTATCCTC TGGTGCCGGG
240

GCACGAGATA GTTGGAATTG TGAAACAGGT TGGCTCCAGT GTCCAACGCT TCAAAGTTGG
300 

CGATCATGTG GGGGTGGGAA CTTATGTCAA TTCATGCAGA GAGTGCGAGT ATTGCAATGA 
360 

CAGGCTAGAA GTCCAATGTG AAAAGTCGGT TATGACTTTT GATGGAATTG ATGCAGATGG 
420 

TACAGTGACA AAGGGAGGAT ATTCTAGTCA CATTGTCGTC CATGAAAGGT ATTGCGTCAG 
480 

GATTCCAGAA AACTACCCGA TGGATCTAGC AGCGCATTGC TCTGTGCTGG ATCAC 
535 

(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 671 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

GCGCCTGCAG GTCGACACTA GTGGATCCAA AGAATTCGGC ACGAGGTTGC AGGTCGGGGA 

60

TGATTTGAAT CACAGAAACC TCAGCGATTT TGCCAAGAAA TATGGCAAAA TCTTTCTGCT 
120 

CAAGATGGGC CAGAGGAATC TTGTGGTAGT TTCATCTCCC GATCTCGCCA AGGAGGTCCT 
180 

GCACACCCAG GGCGTCGAGT TTGGGTCTCG AACCCGGAAC GTGGTGTTCG ATATCTTCAC 
240 

GGGCAAGGGG CAGGACATGG TGTTCACCGT CTATGGAGAT CACTGGAGAA AGATGCGCAG 
300 

GATCATGACT GTGCCTTTCT TTACGAATAA AGTTGTCCAG CACTACAGAT TCGCGTGGGA 
360 

AGACGAGATC AGCCGCGTGG TCGCGGATGT GAAATCCCGC GCCGAGTCTT CCACCTCGGG 
420 

CATTGTCATC CGTAGCGCCT CCAGCTCATG ATGTATAATA TTATGTATAG GATGATGTTC 
480 

GACAGGAGAT TCGAATCCGA GGACGACCCG CTTTTCCTCA AGCTCAAGGC CCTCAACGGA 
540 

GAGCGAAGTC GATTGGCCCA GAGCTTTGAG TACAATTATG GGGATTTCAT TCCCAGTCTT 
600 

AGGCCCTTCC TCAGAGGTTA TCACAGAATC TGCAATGAGA TTAAAGAGAA ACGGCTCTCT 
660 

CTTTTCAAGG A 
671 
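
The running base counts printed beneath each row of a sequence can be used to cross-check a transcribed listing against its stated LENGTH. The following is a minimal sketch; the function name `check_running_counts` and the row representation are hypothetical, and the two example rows are the final rows of SEQ ID NO: 2 as given above.

```python
def check_running_counts(rows, start=0):
    """Return indices of rows whose printed cumulative count is inconsistent.

    rows  -- list of (sequence text, printed cumulative count) pairs
    start -- cumulative count printed for the row preceding rows[0]
    """
    mismatches = []
    previous = start
    for index, (text, printed) in enumerate(rows):
        # Count letters only, so the spaces between 10-base blocks are ignored.
        bases = sum(1 for ch in text if ch.isalpha())
        if previous + bases != printed:
            mismatches.append(index)
        previous = printed  # resynchronise on the printed count
    return mismatches


# Final two rows of SEQ ID NO: 2; the preceding row ends at position 600.
ROWS = [
    ("AGGCCCTTCC TCAGAGGTTA TCACAGAATC TGCAATGAGA TTAAAGAGAA ACGGCTCTCT", 660),
    ("CTTTTCAAGG A", 671),
]

print(check_running_counts(ROWS, start=600))  # []
```

An empty result means every row length agrees with its printed count, so the last count (671) matches the stated LENGTH of 671 base pairs for these rows.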

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 940 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CTTCAGGACA AGGGAGAGAT CAATGAGGAT AATGTTTTGT ACATCGTTGA GAACATCAAC
60

GTTGCAGCAA TTGAGACAAC GCTGTGGTCG ATGGAATGGG GAATAGCGGA GCTGGTGAAC
120

CACCAGGACA TTCAGAGCAA GGTGCGCGCA GAGCTGGACG CTGTTCTTGG ACCAGGCGTG
180

CAGATAACGG AACCAGACAC GACAAGGTTG CCCTACCTTC AGGCGGTTGT GAAGGAAACC
240

CTTCGTCTCC GCATGGCGAT CCCGTTGCTC GTCCCCCACA TGAATCTCCA CGACGCCAAG
300 

CTCGGGGGCT ACGATATTCC GGCAGAGAGC AAGATCCTGG TGAACGCCTG GTGGTTGGCC 
360 

AACAACCCCG CCAACTGGAA GAACCCCGAG GAGTTCCGCC CCGAGCGGTT CTTCGAGGAG 
420 

GAGAAGCACA CCGAAGCCAA TGGCAACGAC TTCAAATTCC TGNCCTTCGG TGTGGGGAGG 
480 

AGGAGCTGCC CGGGAATCAT TCTGGCGCTG CTCTCCTCGC ACTCTCCATC GGAAGACTTG 
540 

TTCAGAACTT CCACCTTCTG CCGCCGCCCG GGCAGAGCAA AGTGGATGTC ACTGAGAAGG 
600 

GCGGGCAATT CAGCCTTCAC ATTCTCAACC ATTCTCTCAT CGTCGCCAAG CCCATAGCTT 
660 

CTGCTTAATC CCAACTTGTC AGTGACTGGT ATATAAATGC GCGCACCTGA ACAAAAAACA 
720 

CTCCATCTAT CATGACTGTG TGTGCGTGTC CACTGTCGAG TCTACTAAGA GCTCATAGCA 
780 

CTTCAAAAGT TTGCTAGGAT TTCAATAACA GACACCGTCA ATTATGTCAT GTTTCAATAA 
840 

AAGTTTGCAT AAATTAAATG ATATTTCAAT ATACTATTTT GACTCTCCAC CAATTGGGGA 
900 

ATTTTACTGC TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
940 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 949 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

NNGCTCNACC GACGGTGGAC GGTCCGCTAC TCAGTAACTG AGTGGGATCC CCCGGGCTGA 
60 

CAGGCAATTC GATTTAGCTC ACTCATTAGG CACCCCAGGC TTTACACTTT ATGCTTCCGG 
120 

CTCGTATGTT GTGTGGAATT GTGAGCGGAT AACAATTTCA CACAGGAAAC AGCTATGACC 
180 

ATGATTACGC CAAGCGCGCA ATTAACCCTC ACTAAAGGGA ACAAAAGCTG GAGCTCCACC 
240 

GCGGTGGCGG CCGCTCTAGA ACTAGTGGAT CCAAAGAATT CGGCACGAGA CCCAGTGACC 
300 

TTCAGGCCTG AGAGATTTCT TGAGGAAGAT GTTGATATTA AGGGCCATGA TTACAGGCTA 
360 

CTGCCATTGG TGCAGGGCGC AGGATCTGCC CTGGTGCACA ATTGGGTATT AATTTAGTTC 
420 

AGTCTATGTT GGGACACCTG CTTCATCATT TCGTATGGGC ACCTCCTGAG GGAATGAAGG 
480 

CAGAAGACAT AGATCTCACA GAGAATCCAG GGCTTGTTAC TTTCATGGCC AAGCCTGTGC 
540 

AGGCCATTGC TATTCCTCGA TTGCCTGATC ATCTCTACAA GCGACAGCCA CTCAATTGAT 
600 

CAATTGATCT GATAGTAAGT TTGAATTTTG TTTTGATACA AAACGAAATA ACGTGCAGTT 
660 

TCTCCTTTTC CATAGTCAAC ATGCAGCTTT CTTTCTCTGA AGCGCATGCA GCTTTCTTTC 
720 

TCTGAAGCCC AACTTCTAGC AAGCAATAAC TGTATATTTT AGAACAAATA CCTATTCCTC 
780 

AAATTGAGWA TTTCTCTGTA GGGGNNGNTA ATTGTGCAAT TTGCAAGNAA TAGTAAAGTT 
840 

TANTTTAGGG NATTTTAATA GTCCTANGTA ANANGNGGNA ATGNTAGNGG GCATTNAGAA 
900 
ANCCCTAATA GNTGTTGGNG GNNGNTAGGN TTTTTNACCA AAAAAAAAA
949

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GAATTCGGCA CGAGAAAGCC CTAGAATTTT TTCAGCATGC TATCACAGCC CCAGCGACAA 
60 

CTTTAACTGC AATAACTGTG GAAGCGTACA AAAAGTTTGT CCTAGTTTCT CTCATTCAGA 
120 

CTGGTCAGGT TCCAGCATTT CCAAAATACA CACCTGCTGT TGTCCAAAGA AATTTGAAAT 
180 

CTTGCACTCA GCCCTACATT GATTTAGCAA ACAACTACAG TAGTGGGAAA ATTTCTGTAT 
240 

TGGAAGCTTG TGTCAACACG AACACAGAGA AGTTCAAGAA TGATAGTAAT TTGGGGTTAG 
300 

TCAAGCAAGT TTTGTCATCT CTTTATAAAC GGAATATTCA GAGATTGACA CAGACATATC 
360 

TGACCCTCTC TCTTCAAGAC ATAGCAAGTA CGGTACAGTT GGAGACTGCT AAGCAGGCTG
420 

AACTCCATGT TCTGCAGATG ATTCAAGATG GTGAGATTTT TGCAACCATA AATCAGAAAG 
480 

ATGGGATGGT GAGCTTCAAT GAGGATCCTG AACAGTACAA AACATGTCAG ATGACTGAAT 
540 

ATATAGATAC TGCAATTCGG AGAATCATGG CACTATCAAA GAAGCTCACC ACAGTAGATG 
600 

AGCAGATTTC GTGTGATCAT TCCTACCTGA GTAAGGTGGG GAGAGAGCGT TCAAGATTTG 
660 

ACATAGATGA TTTTGATACT GTTCCCCAGA AGTTCANAAA TATGTAACAA ATGATGTAAA 
720 

TCATCTTCAA GACTCGCTTA TATTCATTAC TTTCTATGTG AATTGATAGT CTGTTAACAA 
780 

TAGTACTGTG GCTGAGTCCA GAAAGGATCT CTCGGTATTA TCACTTGACA TGCCATCAAA 
840 

AAAATCTCAA ATTTCTCGAT GTCTAGTCTT GATTTTGATT ATGAATGCGA CTTTTAGTTG 
900 

TGACATTTGA GCACCTCGAG TGAACTACAA AGTTGCATGT TAAAAAAAAA AAAAAAAAA
959 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1026 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GAATTCGGCA CGAGCTTTGA GGCAACCTAC ATTCATTGAA TCCCAGGATT TCTTCTTGTC
60

CAAACAGGTT TAAGGAAATG GCAGGCACAA GTGTTGCTGC AGCAGAGGTG AAGGCTCAGA
120

CAACCCAAGC AGAGGAGCCG GTTAAGGTTG TCCGCCATCA AGAAGTGGGA CACAAAAGTC
180

TTTTGCAGAG CGATGCCCTC TATCAGTATA TATTGGAAAC GAGCGTGTAC CCTCGTGAGC
240

CCGAGCCAAT GAAGGAGCTC CGCGAAGTGA CTGCCAAGCA TCCCTGGAAC CTCATGACTA
300

CTTCTGCCGA TGAGGGTCAA TTTCTGGGCC TCCTGCTGAA GCTCATTAAC GCCAAGAACA
360

CCATGGAGAT TGGGGTGTAC ACTGGTTACT CGCTTCTCAG CACAGCCCTT GCATTGCCCG
420

ATGATGGAAA GATTCTAGCC ATGGACATCA ACAGAGAGAA CTATGATATC GGATTGCCTA
480

TTATTGAGAA AGCAGGAGTT GCCCACAAGA TTGACTTCAG AGAGGGCCCT GCTCTGCCAG
540

TTCTGGACGA ACTGCTTAAG AATGAGGACA TGCATGGATC GTTCGATTTT GTGTTCGTGG
600

ATGCGGACAA AGACAACTAT CTAAACTACC ACAAGCGTCT GATCGATCTG GTGAAGGTTG
660

GAGGTCTGAT TGCATATGAC AACACCCTGT GGAACGGATC TGTGGTGGCT CCACCCGATG
720

CTCCCCTGAG GAAATATGTG AGATATTACA GAGATTTCGT GATGGAGCTA AACAAGGCCC
780

TTGCTGTCGA TCCCCGCATT GAGATCAGCC AAATCCCAGT CGGTGACGGC GTCACCCTTT
840

GCAGGCGTGT CTATTGAAAA CAATCCTTGT TTCTGCTCGT CTATTGCAAG CATAAAGGCT
900

CTCTGATTAT AAGGAGAACG CTATAATATA TGGGGTTGAA GCCATTTGTT TTGTTTAGTG
960

TATTGATAAT AAAGTAGTAC AGCATATGCA AAGTTTGTAT CAAAAAAAAA AAAAAAAAAA 
1020 

AAAAAA 
1026 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1454 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GAATTCGGCA CGAGGCCAAC TGCAAGCAAT ACAGTACAAG AGCCAGACGA TCGAATCCTG
60

TGAAGTGGTT CTGAAGTGAT GGGAAGCTTG GAATCTGAAA AAACTGTTAC AGGATATGCA
120

GCTCGGGACT CCAGTGGCCA CTTGTCCCCT TACACTTACA ATCTCAGAAA GAAAGGACCT
180

GAGGATGTAA TTGTAAAGGT CATTTACTGC GGAATCTGCC ACTCTGATTT AGTTCAAATG
240

CGTAATGAAA TGGACATGTC TCATTACCCA ATGGTCCCTG GGCATGAAGT GGTGGGGATT
300

GTAACAGAGA TTGGCAGCGA GGTGAAGAAA TTCAAAGTGG GAGAGCATGT AGGGGTTGGT
360

TGCATTGTTG GGTCCTGTCG CAGTTGCGGT AATTGCAATC AGAGCATGGA ACAATACTGC
420

AGCAAGAGGA TTTGGACCTA CAATGATGTG AACCATGACG GCACACCTAC TCAGGGCGGA
480

TTTGCAAGCA GTATGGTGGT TGATCAGATG TWTGTGGTTC GAATCCCGGA GAATCTTCCT
540

CTGGAACAAG CGGCCCCTCT GTTATGTGCA GGGGTTACAG TTTTCAGCCC AATGAAGCAT
600

TTCGCCATGA CAGAGCCCGG GAAGAAATGT GGGATTTTGG GTTTAGGAGG CGTGGGGCAC
660

ATGGGTGTCA AGATTGCCAA AGCCTTTGGA CTCCACGTGA CGGTTATCAG TTCGTCTGAT 
720 

AAAAAGAAAG AAGAAGCCAT GGAAGTCCTC GGCGCCGATG CTTATCTTGT TAGCAAGGAT 
780 
ACTGAAAAGA TGATGGAAGC AGCAGAGAGC CTAGATTACA TAATGGACAC CATTCCAGTT
840 

GCTCATCCTC TGGAACCATA TCTTGCCCTT CTGAAGACAA ATGGAAAGCT AGTGATGCTG 
900 

GGCGTTGTTC CAGAGTCGTT GCACTTCGTG ACTCCTCTCT TAATACTTGG GAGAAGGAGC
960 

ATAGCTGGAA GTTTCATTGG CAGCATGGAG GAAACACAGG AAACTCTAGA TTTCTGTGCA 
1020 

GAGAAGAAGG TATCATCGAT GATTGAGGTT GTGGGCCTGG ACTACATCAA CACGGCCATG 
1080 

GAAAGGTTGG AGAAGAACGA TGTCCGTTAC AGATTTGTGG TGGATGTTGC TAGAAGCAAG 
1140 

TTGGATAATT AGTCTGCAAT CAATCAATCA GATCAATGCC TGCATGCAAG ATGAATAGAT 
1200 

CTGGACTAGT AGCTTAACAT GAAAGGGAAA TTAAATTTTT ATTTAGGAAC TCGATACTGG 
1260 

TTTTTGTTAC TTTAGTTTAG CTTTTGTGAG GTTGAAACAA TTCAGATGTT TTTTTAACTT 
1320 

GTATATGTAA AGATCAATTT CTCGTGACAG TAAATAATAA TCCAATGTCT TCTGCCAAAT 
1380 

TAATATATGT ATTCGTATTT TTATATGAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
1440

AAAAAAAAAA AAAA
1454

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 740 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GAATTCGGCA CGAGACCATT TCCAGCTAAT ATTGGCATAG CAATTGGTCA TTCTATCTTT 
60 

GTCAAAGGAG ATCAAACAAA TTTTGAAATT GGACCTAATG GTGTGGAGGC TAGTCAGCTA 
120 

TACCCAGATG TGAAATATAC CACTGTCGAT GAGTACCTCA GCAAATTTGT GTGAAGTATG 
180 

CGAGATTCTC TTCCACATGC TTCAGAGATA CATAACAGTT TCAATCAATG TTTGTCCTAG 
240 

GCATTTGCCA AATTGTGGGT TATAATCCTT CGTAGGTGTT TGGCAGAACA GAACCTCCTG 
300 

TTTAGTATAG TATGACGAGC TAGGCACTGC AGATCCTTCA CACTTTTCTC TTCCATAAGA
360 

AACAAATACT CACCTGTGGT TTGTTTTCTT TCTTTCTGGA ACTTTGGTAT GGCAATAATG 
420 

TCTTTGGAAA CCGCTTAGTG TGGAATGCTA AGTACTAGTG TCCAGAGTTC TAAGGGAGTT 
480 

CCAAAATCAT CGCTGATGTG AACTGGTTGT TCCAGAGGGT GTTTACAACC AACAGTTGTT 
540 

CAGTGAATAA TTTTGTTAGA GTGTTTAGAT CCATCTTTAC AAGGCTATTG AGTAAGGTTG 
600 

GTGTTAGTGA ACGGAATGAT GTCAAATCTT GATGGGCTGA CTGACTCTCT TGTGATGTCA 
660 

AATCTTGATG GATTGTGTCT TTTTCAATGG TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
720

AAAAAAAAAA AAAAAAAAAA
740



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 624 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GAATTCCTGC AGCCCGGGGG ATCCACTAGT TCTAGAGCGG CCGCCACCGC GGTGGAGCTC 
60 

GCGCGCCTGC AGGTCGACAC TAGTGGATCC AAAGAATTCG GCACGAGGCC CGACGGCCAC 
120 

TTGTTGGACG CCATGGAAGC TCTCCGGAAA GCCGGGATTC TGGAACCGTT TAAACTGCAG
180

CCCAAGGAAG GACTGGCTCT CGTCAACGGC ACAGCGGTGG GATCCGCCGT GGCCGCGTCC
240

GTCTGTGTTG ACGCCAACGT GCTGGGCGTG CTGGCTGAGA TTCTGTCTGC GCTCTTCTGC
300

GAGGTGATGC AAGGGAAACC GGAGTTCGTA GATCCGTTAA CCCACCAGTT GAAGCACCAC
360

CCAGGGCAGA TCGAAGCCGC GGCCGTCATG GAGTTCCTCC TCGACGGTAG CGACTACGTG
420

AAAGAAGCAG CGCGGCTTCA CGAGAAAGAC CCGTTGAGCA AACCGAAACA AGACCGCTAC
480

GCTCTGCGAA CATCGCCACA GTGGTTGGGG CCTCCGATCG AAGTCATCCG CGCTGCYACT
540

CACTCCATCG AGCGGGAGAT CAATTCCGTC AACGACAATC CGTTAATCGA TGTCTCCAGG
600

GACATGGCTG TCCACGGCGG CAAC 
624 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 278 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GAATTCCTGC AGCCCGGGGG ATCCACTAGT TCTAGAGCGG CCGCCACCGC GGTGGAGCTC
60

CAGTACCTGG CCAACCCCGT CACGACTCAC GTCCAGAGCG CCGAACAACA CAACCAGGAT
120

GTCAATTCCC TCGGCTTGAT CTCCGCCAGA AAGACTGCCG AGGCCGTTGA GATTTTAAAG
180

"tGATGTTCG CTACATATCT GGTGGCCTTA TGCCAGGCGA TCGATCTCCG GCACCTGGAA
240

GAAAACATGC GATCCGTTGT GAAGCACGTA GTCTTGCA 
278 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 765 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GAGCTCCTGC AAGTCATCGA TCATCAGCCC GTTTTCTCGT ACATCGACGA TCCCACAAAT
60

CCATCATACG CGCTTATGCT CCAACTCAGA GAAGTGCTCG TAGATGAGGC TCTCAAATCA
120

TCTTGCCCAG ACGGGAATGA CGAATCCGAT CACAATTTGC AGCCCGCTGA GAGCGCTGGA
180 

GCTGCTGGAA TATTACCCAA TTGGGTGTTT AGCAGGATCC CCATATTTCA AGAGGAGTTG 
240 

AAGGCCCGTT TAGAGGAAGA GGTTCCGAAG GCGAGGGAAC GATTCGATAA TGGGGACTTC 
300 

CCAATTGCAA ACAGAATAAA CAAGTGCAGG ACATATCCCA TTTACAGATT CGTGAGATCA 
360 

GAGTTGGGAA CCGATTTGCT AACAGGGCCC AAGTGGAGAA GCCCCGGCGA AGATATAGAA 
420 

AAGGTATTTG AGGGCATTTG CCAAGGGAAA ATTGGAAACG TGATCCTCAA ATGTCTGGAC 
480 

GCTTGGGGTG GGTGCGCTGG ACCATTCACT CCACGTGCAT ATCCTGCGTC TCCTGCAGCG 
540 

TTCAATGCCT CATATTGGGC ATGGTTTGAT AGCACCAAAT CACCCTCTGC AACGAGCGGC 
600 

AGAGGTTTCT GGAGCGCCCA ACAACAACAA GTTCTTTGAT TTAACTGACT CTTAAGCATT 
660 

CCTAAACAGC TTGTTCTTCG CAATAACGAA TCTTTCATCT TCGTTACTTT GTAAAAGATG 
720 

GGGTTCCAAC AAAATAGAAG AAATATTTTC GATCCAAAAA AAAAA 
765 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 453 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

TGATTATGCG GATCCTTGGG CAGGGATACG GCATGACAGA AGCAGGCCCG GTGCTGGCAA 
60 

TGAACCTAGC CTTCGCAAAG AATCCTTTCC CCGCCAAATC TGGCTCCTGC GGAACAGTCG 
120 

TCCGGAACGC TCAAATAAAG ATCCTCGATT ACAGGAACTG GCGAGTCTCT CCCGCACAAT 
180 

CAAGCCGGCG AAATCTGCAT CCGCGGACCC GAAATAATGA AAGGATATAT TAACGACCCG 
240 

GAATCCACGG CCGCTACAAT CGATGAAGAA GGCTGGCTCC ACACAGGCGA CGTCGGGTAC 
300 

ATTGACGATG ACGAAGAAAT CTTCATAGTC GACAGAGTAA AGGAGATTAT CAATATAAAG
360 

GCTTCCAGGT GGATCCTGCT AATCGAATTC CTGCAGCCCG GGGGTCCACT AGTTCTAGAG 
420 

CGGCCGCCAC CGCGGTGGAG CTCCAGCTTT TGT 
453 



(2) INFORMATION FOR SEQ ID NO: 13: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 278 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

TCTTCGAATT CTCTTTCACG ACTGCTTCGT TAATGGCTGC GATGGCTCGA TATTGTTAGA
60

TGATAACTCA ACGTTCACCG GAGAAAAGAC TGCAGGCCCA AATGTTAATT CTGCGAGAGG
120

ATTCGACGTA ATAGACACCA TCAAAACTCA AGTTGAGGCA GCCTGCAGTG GTGTCGTGTC
180 

AGTTGCCGAC ATTCTCGCCA TTGCTGCACG CGATTCAGTC GTCCAACTGG GGGGCCCAAC 
240 

ATGGACGGTA CTTCTGGGAG AAAAGACGGA TCCGATCA 
278 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CTTCGAATTC WYTTYCAYGA YTG 
23 
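
Each entry in this listing declares a LENGTH and prints a running position count under every row of bases. As a minimal sketch of how a declared length could be checked against the residues actually listed, the following Python snippet counts the letters in an entry while ignoring spaces and position counters. The function name `count_residues` is illustrative and not part of the original disclosure; the example input is the row printed for SEQ ID NO: 14.

```python
import re


def count_residues(listing_text: str) -> int:
    """Count nucleotide symbols in a listing fragment.

    Digits (running position counters) and whitespace are ignored,
    so the rows can be pasted in exactly as printed.
    """
    return len(re.sub(r"[^A-Za-z]", "", listing_text))


# Row and counter as printed for SEQ ID NO: 14.
seq14_rows = "CTTCGAATTC WYTTYCAYGA YTG\n23"
declared_length = 23

print(count_residues(seq14_rows) == declared_length)
```

The same check can be applied to any other entry by pasting its rows and comparing the result with the "(A) LENGTH" field.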

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GATCGGATCC RTCYYKYCTY CC 
22 
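
SEQ ID NOS: 14 and 15 contain ambiguity symbols (W, Y, R, K), so each entry describes a pool of related oligonucleotides rather than a single sequence. The sketch below, which assumes the symbols follow the standard IUPAC nucleotide code, counts how many distinct sequences each entry represents and can enumerate them. The names `IUPAC`, `degeneracy` and `expand` are illustrative only and do not come from the original disclosure.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (assumed to apply to this listing).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(seq: str) -> int:
    """Number of distinct sequences represented by a degenerate oligonucleotide."""
    total = 1
    for symbol in seq.replace(" ", "").upper():
        total *= len(IUPAC[symbol])
    return total


def expand(seq: str):
    """Yield every concrete sequence encoded by a degenerate oligonucleotide."""
    choices = [IUPAC[symbol] for symbol in seq.replace(" ", "").upper()]
    for combo in product(*choices):
        yield "".join(combo)


seq_id_14 = "CTTCGAATTC WYTTYCAYGA YTG"
seq_id_15 = "GATCGGATCC RTCYYKYCTY CC"

print(degeneracy(seq_id_14))  # five two-fold positions
print(degeneracy(seq_id_15))  # six two-fold positions
```

Enumerating with `expand` is practical here because both pools are small; for entries with many N positions, only the count from `degeneracy` is likely to be useful.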

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 472 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

AATTCGGCAC GAGACGACCT CTTGTATCGG ACCCGGATCC GCTATCGTTA ACGTACACAC
60

GTTCTAGTGC TGAATGGAGA TGGAGAGCAC CACCGGCACC GGCAACGGCC TTCACAGCCT
120

CTGCGCCGCC GGGAGCCACC ATGCCGACCC ACTGAACTGG GGGGCGGCGG CAGCAGCCCT
180

CACAGGGAGC CACCTCGACG AGGTGAAGCG GATGGTCGAG GAGTACCGGA GGCCGGCGGT
240

GCGCCTCGGC GGGGAGTCCC TCACGATAGC CCAGGTGGCG GCGGTGGCGA GTCAGGAGGG
300

GGTAGGGGTC GAGCTCTCGG AGGCGGCCCG TCCCAGGGTC AAGGCCAGCA GCGACTGGGT
360

CATGGAGAGC ATGAACAAGG GAACTGACAG CTACGGGGTC ACCACCGGGT TCGGCGGCAA
420

CTTCTCAAAC CGGAGGCCGA AGCAAGGCGG TCCTTTTCAG AAGGAACTTA TA 
472 

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 622 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

CCAAAGCTCC TAGTGCCTCA TGAGTCTGCT GAGGATTGCA CAATTGGCGG GTTCGACGTG 
60 

CCCCGAGGCP, CCATGATCCT GGTTAATGCG TGGGCAATTC AAAGAGACCC AAAAGTGTGG 
120 

GACGATCCCA CAAATTTTAA ACCGGAGAGG TACGAGGGAT TGGAAGGTGA TCATGCCTAC 
180 

CGACTATTGC CGTTTGGGAT GGGGAGGAGA AGTTGTCCTG GTGCTGGCCT TGCCAATAGA 
240 

GTGGTGAGCT TGGTCCTGGC GGCGCTTATT CAGTGCTTCG AATGGGAACG AGTTGGCGAA 
300 

GAATTGGTGG ACTTGTCCGA GGGGACGGGA CTCACAATGC CAAAGAGAGA GCCATTGGAG 
360 

GCCTTGTGCA AAGCGCGTGA ATGCATGATA GCTAATGTTC TTGCGCACCT TTAAGAAGGT 
420 

CGTTGTCTAA TGAATTTACA TTGGTGATGT ATCTCCAATG TTTTTGAATA ATCAAATAGA 
480

CTGAAAATAG GCCAGTGCAG CTTTAGGAAT GATCGTGAGC ATCAATAGCA TCCTGAGGAG
540

GCCAATGCAG CTTTAGGCCT TTCTCTTAGG AGAAAAATGA TGGTTTATAT AGGTACTGGC 
600 

AACATTGTTC AAAAAAAAAA AA 
622 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 414 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

CACGCTCGAC GAATTCGGTA CCCCGGGTTC GAAATCGATA AGCTTGGATC CAAAGCAACA 
60 

CATTGAACTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCCCCCACCC CCCCTTCCCA 
120 

ACCCCACCCA CATACAGACA AGTAGATACG CGCACACAGA AGAAGAAAAG ATGGGGGTTT 
180 

CAATGCAGTC AATCGCACTA GCGACGGTTC TGGCCGTCCT AACGACATGG GCGTGGAGGG 
240 

CGGTGAACTG GGTGTGGCTG AGGCCGAAGA GGCTCGAGAG GCTTCTGAGA CAGCAAGGTC 
300 

TCTCCGGCAA GTCCTACACC TTCCTGGTCG GCGACCTCAA GGAGAACCTG CGGATGCTCA
360 

AGGAAGCCAA GTCCAAGCCC ATCGCCGTCT CCGATGACAT CAAGCCTCGT CTCT 
414 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 469 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GAATTCGGCA CGAGTGTCTC TCTCTCTCTC TCTCTCTGTA AACCACCATG CTCTTCCTCA
60

CTCATCTCCT AGCAGTTCTA GGGGTTGTGT TGCTCCTGCT AATTCTATGG AGGGCAAGAT
120

CTTCTCCGAA CAAACCCAAA GGTACTGCCT TACCCCCGGA GCTGCCGGGC GCATGGCCGA
180

TCATAGGCCA CATCCACTTG CTGGGCGGCG AGACCCCGCT GGCCAGGACC CTGGCCGCCA
240

TGGCGGACAA GCAGGGCCCG ATGTTTCGGA TCCGTCTCGG AGTCCACCCG GCGACCATCA
300

TAAGCAGCCG TGAGGCGGTC CGGGAGTGCT TCACCACCCA CGACAAGGAC CTCGCTTCTC
360

GCCCCAAATC CAAGGCGGGA ATCCACTTGG GCTACGGGTA TGCCGGTTTT GGCTTCGTAG 
420 

AATACGGGGA CTTTTGGCGC GAGATGAGGA AGATCACCAT GCTCGAGCT 
469 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 341 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CGGGCTCGTG GCTCGGCTCC GGCGCAACGC CCTTCCCACC GGGCCCGAGG GGCCTCCCGG 
60 

TCATCGGGAA CATGCTCATG ATGGGCGAGC TCACCCACCG CGGCCTCGCG AGTCTGGCGA
120

AGAAGTATGG CGGGATCTTC CACCTCCGCA TGGGCTTCCT GCACATGGTT GCCGTGTCGT
180

CCCCCGACGT GGCCCGCCAG GTCCTCCAGG TCCACGACGG GATCTTCTCG AACCGGCCTG
240

CCACCATCGC GATCAGCTAC CTCACGTATG ACCGGGCCGA CATGGCCTTC GCGCACTACG
300

GCCCGTTCTG GCGGCAGATG CGGAAGCTGT GCGTGATGAA A 
341 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 

GAATTCGGCA CGAGCGGGCT CGTGGCTCGG CTCCGGCGCA ACGCCCTTCC CACCGGGCCC
60

GAGGGGCCTC CCGGTCATCG GGAACATGCT CATGATGGGC GAGCTCACCC ACCGCGGCCT
120

CGCGAGTCTG GCGAAGAAGT ATGGCGGGAT CTTCCACCTC CGCATGGGCT TCCTGCACAT
180

GGTTGCCGTG TCGTCCCCCG ACGTGGCCCG CCAGGTCCTC CAGGTCCACG ACGGGATCTT
240

CTCGAACCGG CCTGCCACCA TCGCGATCAG CTACCTCACG TATGACCGGG CCGACATGGC
300

CTTCGCGCAC TACGGCCCGT TCTGGCGGCA GATGCGGAAG CTGTGCGTGA TGAAAGCTCT
360

TCAGCGGAAG CGGGCTGAGT CGTGGGA
387 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 443 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

CACGAGCTCG TGAGCCTTCC CGGAGACAAG GCCATCTTAC TTCGCAACAA ATTGCGTCCG 
60 

CACTCCTTTC TCAAGAAACC TAGTCATCCA AGAAGCAGAG CATTGCAACT GCAAACAGCC 
120 

AAAGCCCAAA CTCGTACAGA AGGAGAGAGA GAGAGAGAAT AGAAGCATGA GTGCATGCAC 
180 

GAACCAAGCA ATCACGACGG CCAGTGAAGA TGAAGAGTTC TTGTTCGCCA TGGAAATGAA 
240 

TGCTCTGATA GCACTCCCCT TGGTCTTGAA GGCCACCATC GAACTGGGGA TCCTCGAAAT
300 

ACTGGCCGAG TGCGGGCCTA TGGCTCCACT TTCGCCTGCT CAGATTGCCT CCCGTCTCTC 
360 

CGCAAAGAAC CCGGAAGCCC CCGTAACCCT TGACCGGATC CTCCGGTTTC TCGCCAGCTA 
420 

CTCCATCCTC TCTTGCACTC TCG 
443 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 607 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

GAATTCGGCA CGAGCCAACC CTGGACCAGG TACTTTTGGC AGGCGGTCCA TTGCCCTTCA 
60 

AACCGGTCCA AACCGGACCA TCACTGTCCT TATATACGTT £CATCATGCC TGCTCATAGA 
120 

ACTTAGGTCA ACTGCAACAT TTCTTGATCA CAACATATTA CAATATTCCT AAGCAGAGAG 
180 

AGAGAGAGAG AGAGAGAGAG AGAGAGAGAG AGAGTTTGAA TCAATGGCCA CCGOCGGAGA 
240 

GGAGAGCCAG ACCCAAGCCG GGAGGCACCA GGAGGTTGGC CACAAGTCTC TCCTTCAGAG 
300 

TGATGCTCTT TACCAATATA TTTTGGAGAC CAGCGTGTAC CCAAGAGAGC CTGAGCCCAT 
360 

GAAGGAGCTC AGGGAAATAA CAGCAAAACA TCCATGGAAC ATAATGACAA CATCAGCAGA 
420 

CGAAGGGCAG TTCTTGAACA TGCTTCTCAA GCTCATCAAA GCCAAGAACA CCATGGAGAT 
480 

TGGTGTCTTC ACTGGCTACT CTCTCCTCGC CACCGCTCTT CCTCTTCCTG ATGACGGAAA
540

GATTTTGGCT ATGGACATTA ACAGAGAGAG CTATGAACTT GGCCTGCCGG CATCCAAAAA
600 

GCCGGTG 
607 



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 421 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

GAATTCGGCA CGAGCCGTTT TATTTCCTCT GATTTCCTTT GCTCGAGTCT CGCGGAAGAG 
60 

AGAGAAGAGA GGAGAGGAGA GAATGGGTTC GACCGGATCC GAGACCCAGA TGACCCCGAC 
120 

CCAAGTCTCG GACGAGGAGG CGAACCTCTT CGCCATGCAG CTGGCGAGCG CCTCCGTGCT 
180 

CCCCATGGTC CTCAAGGCCG CCATCGAGCT CGACCTCCTC GAGATCATGG CCAAGGCCGG 
240 

GCCGGGCGCG TTCCTCTCCC CGGGGGAAGT CGCGGCCCAG CTCCCGACCC AGAACCCCGA 
300 

GGCACCCGTA ATGCTCGACC GGATCTTCCG GCTGCTGGCC AGCTACTCCG TGCTCACGTG 
360 

CACCCTCCGC GACCTCCCCG ATGGCAAGGT CGAGCGGCTC TACGGCTTAG CGCCGGTGTG 
420 

C 
421 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 760 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

GGAAGAAGCC GAGCAAACGA ATTGCAGACG CCATTGAAAA AAGACACGAA AGAGATCAAG
60

AAGGAGCTTA AGAAGCATCA TCAATGGCAG CCAACGCAGA GCCTCAGCAG ACCCAACCAG
120

CGAAGCATTC GGAAGTCGGC CACAAGAGCC TCTTGCAGAG CGATGCTCTC TACCAGTATA
180

TATTGGAGAC CAGCGTCTAC CCAAGAGAGC CAGAGCCCAT GAAGGAGCTC AGGGAAATAA
240

CAGCCAAACA TCCATGGAAC CTGATGACCA CATCGGCGGA TGAAGGGCAG TTCCTGAACA
300

TGCTCCTCAA GCTCATCAAC GCCAAGAACA CCATGGAGAT CGGCGTCTAC ACCGGCTACT
360

CTCTCCTCGC AACCGCCCTT GCTCTTCCCG ATGACGGAAA GATCTTGGCC ATGGCCATCA
420

ATAGGGAGAA CTTCGAGATC GGGCTGCCCG TCATCCAGAA GGCCGGCCTT GCCCACAAGA
480

TCGATTTCAG AGAAGGCCCT GCCCTGCCGC TCCTTGATCA GCTCGTGCAA GATGAGAAGA
540

ACCATGGAAC GTACGACTTC TTCTCAATCC TTAATCGTTC ATTTGAATAC AAATACATGC
600

TCAATGGTTC AAAGACAACA TAAGACAGAA GATGGAAAAA ATAGAAAGGA AGGAAAGTAT
660

TAAGGGTAGT TTCTCATTTC ATCAATGCTT GATTTTGAGA TCTCCTTTCT GGTGCGATCA 
720 

GCTGACCCGG CGGCACAGGT GATGCCATCC CCGACGGGAA 
760 

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 508 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

GAATTCGGTA CCCGGGTTCG AAATCGATAA GCTTGGATCC AAAGAATTCG GCACGAGATC 
60 

ACTAACCATC TGCCTTTCTT CATCTTCTTT CTTCTGCTTC TCCTCCGTTT CCTCGTTTCG 
120 

ATATCGTGAA AGGAGTCCGT CGACGACAAT GGCCGAGAAG AGCAAGGTCC TGATCATCGG 
180 

AGGGACGGGC TACGTCGGCA AGTTCATCGT GGAAGCGAGT GCAAAAGCAG GGCATCCCAC 
240 

GTTCGCGCTG GTTAGGCAGA GCACGGTCTC CGACCCCGTC AAGGGCCAGC TCGTCGAGAG 
300 

CTTCAAGAAC TTGGGCGTCA CTCTGCTCAT CGGTGATCTG TACGATCATG AGAGCTTGGT 
360 

GAAGGCAATC AAGCAAGCCG ACGTGGTGAT ATCGACAGTG GGGCACATGC AAATGGCGGA
420

TCAGACCAAA GAATCGTCGA CGCCATTAAA GGAAGCTGGC AACGTTAAGG TTTGTTGGTT 
480 

GGTTCATTTG ATCTGGTTTG GGGGGGTC 
508 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 495 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:



GAATTCGGCA CGAGGTTAAT GGCAGTGCAG CCTCAACACC ACCCACCTTC CTCCATCTCT
60

CTCCTCCCTT CTTCTTTCTC TGACTTCAAT GGCAGCCGAC TCCATGCTTG CGTTCAGTAT
120

AAGAGGAAGG TGGGGCAGCC TAAAGGGGCA CTGCGGGTCA CTGCATCAAG CAATAAGAAG
180

ATCCTCATCA TGGGAGGCAC CCGTTTCATC GGTGTGTTTT TGTCGAGACT ACTTGTCAAA
240

GAAGGTCATC AGGTCACTTT GTTTACCAGA GGAAAAGCAC CCATCACTCA ACAATTGCCT
300

GGTGAGTCGG ACAAGGACTT CGCTGATTTT TCATCCAAGA TCCTGCATTT GAAAGGAGAC
360

AGAAAGGATT TTGATTTTGT TAAATCTAGT CTTGCTGCAG AAGGCTTTGA CGTTGTTTAT
420

GACATTAACG GCGAGAGGCG GATGAAGTCG CACCAATTTT GGATGCCTGC CAAACCTTGA
480

ACCAGTCAAC TACTG
495



(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 472 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

GAATTCGGCA CGAGCATAAG CTCTCCCGTA ATCCTCACAT CACATGGCGA AGAGCAAGGT
60

^CTCGTCGTT GGCGGCACTG GCTACCTCGG GCGGAGGTTC GTGAGGGCGA GCCTGGACCA
120

"ggccacccc ACGTACGTCC TCCAGCGTCC GGAGACCGGC CTCGACATTG AGAAGCTCCA
180

"acgctactg CGCTTCAAGA GGCGTGGCGC CCAACTCGTC GAGGCCTCGT TCTCAGACCT
240

GAGGAGCCTC GTCGACGCTG TGAGGCGGGT CGATGTCGTC GTCTGTGCCA TGTCGGGGGT
300

?2acttccgg AGCCACAACA TCCTGATGCA GCTCAAGCTC GTGGAGGCTA TCAAAGAAGC
360

?GGAAATGTC AAGCGGTTTT TGCCGTCAGA GTTCGGAATG GACCCGGCCC TCATGGGTCA
420

TGCAATTGAG CCGGGAAGGG TCACGTTCGA TGAGAAATGG AGGTGAGAAA AG
472 

(2) INFORMATION FOR SEQ ID NO:29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 

GAATTCGGCA CGAGGAGGCA CCTCCTCGAA ACGAAGAAGA AGAAGGACGA AGGACGAAGG
60

AGACGAAGGC GAGAATGAGC GCGGCGGGCG GTGCCGGGAA GGTCGTGTGC GTGACCGGGG
120

"ctccggtta CATCGCCTCG TGGCTCGTCA AGCTCCTCCT CCAGCGCGGC TACACCGTCA
180

'aggccaccgt CCGCGATCCG AATGATCCAA AAAAGACTGA ACATTTGCTT GGACTTGATG
240

GAGCGAAAGA TAGACTTCAA CTGTTCAAAG CAAACCTGCT GGAAGAGGGT TCATTTGATC
300

??ATTGTTGA GGGTTGTGCA GGCGTTTTTC AAACTGCCTC TCCCTTTTAT CATGATGTCA
360

AGGATCCGCA GGCAGAATTA CTTGATCCGG CTGTAA
396 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 592 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

GAATTCGGCA CGAGGTTGAA CCTCCCGTCC TCGGCTCTGC TCGGCTCGTC ACCCTCTTCG
60

"gCTCCCGCA TACTCCACCA CCGCGTACAG AAGATGAGCT CGGAGGGTGG GAAGGAGGAT
120

"gcctcggtt GGGCTGCCCG GGACCCTTCT GGGTTCCTCT CCCCCTACAA ATTCACCCGC
180

AGGGCCGTGG GAAGCGAAGA CGTCTCGATT AAGATCACGC ACTGTGGAGT GTGCTACGCA
240

GATGTGGCTT GGACTAGGAA TGTGCAGGGA CACTCCAAGT ATCCTCTGGT 3 CCAGGGC AC
300 

GAGATAGT7G GAATTGTGAA ACAGGTTGGC TCCAGTGTCC AACGCTTCAA AGTTGGCGAT 
360 

CATGTGGGGG TGGGAACTTA TGTCAATTCA TGCAGAGAGT GCGAGTATTG CAATGACAGG 
420 

CTAGAAGTCC AATGTGAAAA GTCGGTTATG ACTTTTGATG GAATTGATGC AGATGGTACA 
480 

GTGACAAAGG GAGGATATTC TAGTCACATT GTCGTCCATG AAAGGTATTG C3TCAGGATT 
540 

CCAGAAAACT ACCCGATGGA TCTAGCAGCG CATTTGCTCT GTGCTGGATC AC
592 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GAATTCGGCA CGAGAACTCA TCTTGAAATG TCATTGGAGT CATCATCCTC TAGTGAGAAG 
60 

AAACAAATGG GTTCCGCCGG ATTCGAATCG GCCACAAAGC CGCACGCCGT TTGCATTCCC 
120 

TACCCTGCAC AAAGCCACAT TGGCGCCATG CTCAAGCTAG CAAAGCTCCT CCATCACAAG 
180 

GGCTTCCACA TCTCCTTCGT CAACACCGAG TTCAACCACC GGCGGCTCGC CAGGGCTCGA 
240 

GGCCCCGAGT TCACAAATGG AATGCTGAGC GACTTTCAGT TCCTGACAAT CCCCGATGGT 
300 

CTTCCTCCTT CGGACTTGGA TGCGATCCAA GACATCAAGA TGCTCTGCGA ATCGTCCAGG 
360 

AACTATATGG TCAGCCCCAT CAACGATCTT GTATCGAGCC TGGGCTCGAA CCCGAGCGTC 
420 

CCTCCGGTGA CTTGCATCAA TCTCGGATGG TTTCATGACA CTCGTGAC 
468 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 405 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 



CTTTACTCCG CCAAGAAGAT CCAATCGCAG TTTTCGCAAT TGGCCCATTA CACAAATGCG
60

GTCCATCTTC ATCGGGAAGT CTCTTGGCAG AAGACCGGAG TTGCATTTCC TGGCTGGACA
120

AGCAAGCCCC TAACTCAGTG GTCTATGTGA GTCTTGGGAG CATCGCCTCT GTGAACGAGT
180

CGGAATTTTC CGAAATAGCT TTAGGTTTAG CCGATAGCCA GCAGCCATTC TTGTGGGTGG
240

TTCGACCCGG GTCAGTGAGC GGCTCGGAAC TCTTAGAGAA TTTGCCCGGT TGCTTTCTGG
300

AGGCATTACA GGAGAGGGGG AAGATTGTGA AATGGGCGCC TCAACATGAA GTGCTGGCTC
360

ATCGGGCTGT CGGAGCGTTT TGGACTCACA ATGGATGGAA CTCCA
405

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 380 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

GGCAAACACG CCCGTTTTCG TTTTACTAAG AGAAGATGGT GAGCGTTGTG GCTGGTAGAG
60

TCGAGAGCTT GTCGAGCAGT GGCATTCAGT CGATCCCGCA GGAGTATGTG AGGCCGAAGG
120

AGGAGCTCAC AAGCATTGGC GACATCTTCG AGGAGGAGAA GAAGCATGAG GGCCCTCAGG
180

TCCCGACCAT CGACCTCGAG GACATAGCGT CTAAAGACCC CGTGGTGAGG GAGAGGTGCC
240

ACGAGGAGCT CAGGAAGGCT GCCACCGACT GGGGCGTCAT GCACCTCGTC AACCATGGGA
300

TCCCCAACGA CCTGATTGAG CGTGTAAAGA AGGCTGGCGA GGTGTTCTTC AACCTCCCGA
360

TCGAGGAGAA GGACAAGCAT 
380 

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 305 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

TTGTACCCGA AGATCTCCGG GACCGTTCGA CGGCGACATC GCCGTCGGCC GGGAACCCGT
60

CGAGGCCGCC GCCGGAGGCC GGGGAGAAGC TGGAGTAGCC GCCGTAGCCG GAGAAGGCGC
120

"gtcgtggtc GGCGGCGGCG GCGTGGTGGA CCTCATCGCC GTCCATGCTG AAGGCGTCGA
180

AGGAAGCGGA CATGGCTGGG GGATCGATCG ACCGATCCGA TCGGCCGGAG GATTTCGAGA
240

TCGGAGATGG AGAGATGGAA ATGAAAGAGA GAGAGAGAGA GAGATCCGGT GGACTGGTGG
300

TGTTT 
305 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 693 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

GAATTCGGCA CGAGCTAAGA GAGGAGAGGA GAGGAGCAAG ATGGCACTAG CAGGAGCTGC
60

ACTGTCAGGA ACCGTGGTGA GCTCCCCCTT TGTGAGGATG CAGCCTGTGA ACAGACTCAG
120

GGCATTCCCC AATGTGGGTC AGGCCCTGTT TGGTGTCAAC TCTGGCCGTG GCAGAGTGAC
180 

TGCCATGGCC GCTTACAAGG TCACCCTGCT CACCCCTGAA GGCAAAGTCG AACTCGACGT 
240 

CCCCGACGAT GTTTACATCT TGGACTACGC CGAGGAGCAA GGCATCGACT TGCCCTACTC 
300 

CTGCCGTGCC GGCTCTTGCT CCTCCTGCGC GGGCAAGGTC GTGGCGGGGA GCGTCGACCA 
360 

GAGCGACGGC AGCTTCCTGG ATGATGATCA GATTGAGGAA GGTTGGGTCC TCACTTGTGT 
420 

CGCCTACCCT AAGTCTGAGG TCACCATTGA GACCCACAAG GAAGAGGAGC TCACTGCTTG 
480 

AAGCTCTCCT ATATTTGCTT TTGCATAAAT CAGTCTCACT CTACGCAACT TTCTCCACTC 
540 

TCTCCCCCCT TCACTACATG TTTGTTAGTT CCTTTAGTCT CTTCCTTTTT TACTGTACGA 
600 

GGGATGATTT GATGTTATTC TGAGTCTAAT GTAATGGCTT TTCTTTTTCC TATTTCTGTA 
660 

TGAGGAAATA AAACTCATGC TCTAAAAAAA AAA 
693 

(2) INFORMATION FOR SEQ ID NO:36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 418 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

AGGACTTTAT TATAAGCATT GTAAAAAGAG TCAAACTAAT ACATCGCAAG AATTGGGTTA 
60 

TCCAATAATC TACAAAAAGA AAAAAGTTTG ATGCATTGAG ATGGTAACTG CTTAATTCAA 
120 

ATGCCTTAGT TTGAAAAATT AACCAACTAT TAAAATTAAT GATGATGAAT ATGGATTATG 
180 

TGTGAAAAAC TATATAGACT TAAAATTGAC TCAGAAGACA TTCTTTTCTT CTTATTTTAT 
240 

GATATGATGA ATTCGGTCTA AACAGGCAAA TGGTGTCAAA CGGGAAGTCG GCAAAACTCT 
300 

TCCTCGGCAG TGACTACCGG GCGGGCGATG ATGCGGATCC GGGGGCCGGG TCGCTGGAGA 
360 

ACATCCCGCA CGGACCGGTC CACGTTTGGT GCGGTGACAA CAGGCAGCCC AACCTGGA 
418 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 777 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

GAATTCGGCA CGAGCATACA ACTACACTGC GACGCCGCCG CAGAACGCGA GCGTGCCGAC 
60 

CATGAACGGC ACCAAGGTCT ACCGGTTGCC GTATAACGCT ACGGTCCAGC TCGTTTTACA 
120 

GGACACCGGG ATAATCGCGC CGGAGACCCA CCCCATCCAT CTGCACGGAT TCAACTTCTT 
180 

CGGTGTGGGC AAAGGAGTGG GGAATTATGA CCCAAAGAAG GATCCCAAGA AGTTCAATCT 
240 

GGTTGACCCA GTGGAGAGGA ACACCATTGG AATCCCATCT GGTGGATGGA TAGCCATCAG
300 

ATTCACAGCA GACAATCCAG GAGTTTGGTT CCTGCACTGC CATCTGGAAG TGCACACAAC 
360 

TTGGGGACTG AAGATGGCAT TCTTGGTGGA CAATGGGAAG GGGCCTAAAG AGACCCTGCT 
420 

TCCACCTCCA AGTGATCTTC CAAAATGTTG ATCATTTGAT CATGAGGACG ACAAGCGATT 
480 

ACTAATGACA CCAAGTTAGT GGAATCTTCT CTTTGAAAAA GAAGAAGAAG AGCAAGAAGA 
540 

ATAAGAAAGA TGAGGAGAGA AGCCATAGAA GATTTGACCA AGAAGAGAGA GGGCAATAAA 
600 

CCAAAGAGAC CCTTGAGATC ACGACATCCC GCAATTG7TT CTAGAGTAAT AGAAGGATTT 
660 

ACTCCGACAC TGCTACAATA AATTAAGGAA GACAAGGAAT TTGGTTTTTT TCATTGGAGG 
720 

AGTGTAATTT GTTTTTTGGC AAGCTCATCA CATGAATCAC ATGGAAAAAA AAAAAAA
777 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 344 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

ATATGTTCAG AATTTCAAAT GTGGGAATGT CAACCTCCTT GAACTTCAGA ATTCAGGGCC 
60 

ATACGTTGAA GCTAGTCGAG GTTGAAGGAT CTCACACCGT CCAGAACATG TATGATTCAA 
120 

TCGATGTTCA CGTGGGCCAA TCCATGGCTG TCTTAGTGAC CTTAAATCAG CCTCCAAAGG 
180 

ACTACTACAT TGTCGCATCC ACCCGGTTCA CCAAGACGGT TCTCAATGCA ACTGCAGTGC 
240 

TACACTACAC CAACTCGCTT ACCCCAGTTT CCGGGCCACT ACCAGCTGGT CCAACTTACC 
300 

AAAAACATTG GTCCATGAAG CAAGCAAGAA CAATCAGGTG GAAC
344 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 341 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

GCCGCAACTG CAATTCTCTT CGTAAAACAT GACGGCTGTC GGCAAAACCT CTTTCCTCTT 
60 

GGGAGCTCTC CTCCTCTTCT CTGTGGCGGT GACATTGGCA GATGCAAAAG TTTACTACCA 
120 

TGATTTTGTC GTTCAAGCGA CCAAGGTGAA GAGGCTGTGC ACGACCCACA ACACCATCAC 
180 

GGTGAACGGG CAATTCCCGG GTCCGACTTT GGAAGTTAAC GACGGCGACA CCCTCGTTGT 
240 

CAATGTCGTC AACAAAGCTC GCTACAACGT CACCATTCAC TGGCACGGCG TCCGGCAGGT 
300 

GAGATCTGGT TGGGCTGATG GGGCGGAATT TGTGACTCAA T 
341

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 358 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

GAATTCGGCA CGAGATATGT TCAGAATTTC AAATGTGGGA ATGTCAACCT CCTTGAACTT 
60 

CAGAATTCAG GGCCATACGT TGAAGCTAGT CGAGGTTGAA GGATCTCACA CCGTCCAGAA 
120 

CATGTATGAT TCAATCGATG TTCACGTGGG CCAATCCATG GCTGTCTTAG TGACCTTAAA 
180 

TCAGCCTCCA AAGGACTACT ACATTGTCGC ATCCACCCGG TTCACCAAGA CGGTTCTCAA 
240 

TGCAACTGCA GTGCTACACT ACACCAACTC GCTTACCCCA GTTTCCGGGC CACTACCAGC 
300 

TGGTCCAACT TACCAAAAAC ATTGGTCCAT GAAGGAAGGA AGAACAATCA GGTGGAAC
358 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 409 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

ATCAAGAGTT TGAGTCTAAA CCTTGTCTAA TCCTCTCTCG CATAGTCATT TGGAGACGAA 
60 

TGCTGATCGG CCGCAGCTGC ATTCTCTTCG TAAAACATGA CGGCTGTCGG CAAAACCTCT 
120 

TTCCTCTTGG GAGCTCTCCT CCTCTTCTCT GTGGCGGTGA CATTGGCAGA TGCAAAAGTT 
180 

TACTACCATG ATTTTGTCGT TCAAGCGACC AAGGTGAAGA GGCTGTGCAC GACCCACAAC 
240 

ACCATCACGG TGAACGGGCA ATTCCCGGGT CCGACTTTGG AAGTTAACGA CGGCGACACC 
300 

CTCGTTGTCA ATGTCGTCAA CAAAGCTCGC TACAACGTCA CCATTCACTG GCACGGCGTC 
360 

CGGCAGGTGA GATCTGGTTG GGCTGATGGG GOGGAATTTG TGACTCAAT 
409 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 515 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

CTCTCTCTCT CTCTCTCTCT GTGTGTTCAT TCTCGTTGAG CTCGTGGTCG CCTCCCGCCA 
60 

TGGATCCGCA CAAGTACCGT CCATCCAGTG CTTTCAACAC TTCTTTCTGG ACTACGAACT 
120 

CTGGTGCTCC TGTCTGGAAC AATAACTCTT CGTTGACTGT TGGAAGCAGA GGTCCAATTC
180 

TTCTTGAGGA TTATCACCTC GTGGAGAAAC TTGCCAACTT TGATAGGGAG AGGATTCCAG 
240 

AGCGTGTGGT GCATGCCAGA GGAGCCAGTG CAAAGGGATT CTTTGAGGTC ACTCATGACA 
300 

TTTCCCAGCT TACCTGTGCT GATTTCCTTC GGGCACCAGG AGTTCAAACA CCCGTGATTG 
360 

TCCGTTTCTC CACTGTCATC CACGAAAGGG GCAGCCCTGA AACCCTGAGG GACCCTCGAG 
420 

GTTTTGCTGT GAAGTTCTAC ACAAGAGAGG GTAACTTTGA TCTGGTGGGA AACAATTTCC 
480 

CTGTCTTCTT TGTCCGTAAT GGGATAAATT CCCCG 
515 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 471 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

GAATTCGGCA CGAGGCTCCC TCTCGTACTG CCATACTCCT GGGACGGGAT TCGGATAGGG 
60 

ATTTGCGGCG ATCCATTTCT CGATTCAAGG GGAAGAATCA TGGGGAAGTC CTACCCGACC 
120 

GTAAGCCAGG AGTACAAGAA GGCTGTCGAG AAATGCAAGA AGAAGTTGAG AGGCCTCATC 
180 

GCTGAGAAGA GCTGCGCTCC GCTCATGCTC CGCATCGCGT GGCACTCCGC CGGTACCTTC 
240 

GATGTGAAGA CGAAGACCGG AGGCCCGTTC GGGACCATGA AGCACGCCGC GGAGCTCAGC 
300 

CACGGGGCCA ACAGCGGGCT CGACGTTGCC GATCAGGTCT TGCAGCCGAT CAAGGATCAG 
360 

TTCCCCGTCA TCACTTATGC TGATTTCTAC CAGCTGGCTG GCGTCGTTGC TGTGGAAGTT 
420 

ACTGGTGGAC CTGAAGTTGC TTTTCACCCG GAAGAGAGGC AAACCACAAC C 
471 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 487 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

GAATTCGGCA CGAGCTCCCA CTTCTGTCTC GCCACCATTA CTAGCTTCAA AGCCCAGATC 
60 

TCAGTTTCGT GCTCTCTTCG TCATCTCTGC CTCTTGCCAT GGATCCGTAC AAGTATCGCC 
120 

CGTCCAGCGC TTACGATTCC AGCTTTTGGA CAACCAACTA CGGTGCTCCC GTCTGGAACA 
180 

ATGACTCATC GCTGACTGTT GGAACTAGAG GTCCGATTCT CCTGGAGGAC TACCATCTGA
240 

TTGAGAAACT TGCCAACTTC GAGAGAGAGA GGATTCCTGA GCGGGTGGTC CATGCACGGG 
300 

GAGCCAGCGC GAAAGGGTTC TTCGAGGTCA CCCACGACAT CTCTCACTTG ACCTGTGCTG 
360 

ATTTCCTCCG GGCTCCTGGA GTCCAGACGC CCGTAATCGT CCGTTTCTCC ACCGTCATCC
420 

ACGAGCGCGG CAGCCCGAAC CTCAGGGACC CTCGTGGTTT TGCAGTGAAG TTCTACACCA 
480 

GAGAGGG 
487 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 684 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

GAATTCCTGC AGCCCGGGGG 
60 

GCGCGCCTGC AGGTCGACAC 
120 

TTGTTGGACG CCATGGAAGC 
180 

CCCAAGGAAG GACTGGCTCT 
240 

GTCTGTTTTG ACGCCAACGT 
300 

GAGGTGATGC AAGGGAAACC 
360 

CCAGGGCAGA TCGAAGCCGC 
420 

AAAGAAGCAG CGCGGCTTCA 
480 

GCTCTGCGAA CATCGCCACA 
540 

CACTCCATCG AGCGGGAGAT 
600 

GACATGGCTC TCCACGGCGG 
660 

ATGCGAATCT CTTTGGCAGC 
684 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 418 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

GAATTCGGCA CGAGGACAAG GTCATAGGCC CTCTCTTCAA ATGCTTGGAT GGGTGGAAAG
60 

GAACTCCTGG CCCATTCTGA AATAAATAAT CTTCCAAGAT CGCCTTTATA CAACGACTGC 
120 

TATGATTTGA GTCCTCGGAT CTTTTTGTTG ATGCAGTTGT TTACCGATCT GGAATTTGAT 
180 

TGGTCATAAA GCTTGATTTT GTTTTTCTTT CTTTTGTTTT ATACTGCTGG ATTTGCATCC 
240 

CATTGGATTT GCCAGAAATA TGTAAGGGTG GCAGATCATT TGGGTGATCT GAAACATGTA
300

AAAGTGGCGG ATCATTTGGG TAGCATGCAG ATCAGTTGGG TGATCGTGTA CTGCTTTCAC
360

TATTACTTAC AT ATT TAP-AG ATCGGGAATA AAAACATGAT TTTAATTGAA AAAAAAAA
418 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 479 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

GATATCCCAA CGACCGAAAA CCTGTATTTT CAGGGCGCCA TGGGGATCCG GAATTCGGCA 
60 

CGAGCAAGGA AGAAAATATG GTTGCAGCAG CAGAAATTAC GCAGGCCAAT GAAGTTCAAG 
120 

TTAAAAGCAC TGGGCTGTGC ACGGACTTCG GCTCGTCTGG CAGCGATCCA CTGAACTGGG 
180 

TTCGAGCAGC CAAGGCCATG GAAGGAAGTC ACTTTGAAGA AGTGAAAGCG ATGGTGGATT 
240 

CGTATTTGGG AGCCAAGGAG ATTTCCATTG AAGGGAAATC TCTGACAATC TCAGACGTTG 
300 

CTGCCGTTGC TCGAAGATCG CAAGTGAAAG TGAAATTGGA TGCTGCGGCT GCCAAATCTA 
360 

GGGTCGAGGA GAGTTCAAAC TGGGTTCTCA CCCAGATGAC CAAGGGGACG GATACCTATG 
420 

GTGTCACTAC TGGTTTCGGA GCCACTTCTC ACAGGAGAAC GAACCAGGGA GCCGAGCTT 
479 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1785 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

TATCGATAAG CTTGATATCG AATTCCTGCA GCCCGGGGGA TCCACTAGTT CTAGAGCGGC 
60 

CGCCACCGCG GTGGAGCTCG CGCGCCTGCA GGTCGACACT AGTGGATCCA AAGAATTCGG 
120 

CACGAGGTTG CAGGTCGGGG ATGATTTGAA TCACAGAAAC CTCAGCGATT TTGCCAAGAA 
180 

ATATGGCAAA ATCTTTCTGC TCAAGATGGG CCAGAGGAAT CTTGTGGTAG TTTCATCTCC 
240 

CGATCTCGCC AAGGAGGTCC TGCACACCCA GGGCGTCGAG TTTGGGTCTC GAACCCGGAA 
300 

CGTGGTGTTC GATATCTTCA CGGGCAAGGG GCAGGACATG GTGTTCACCG TCTATGGAGA 
360 

TCACTGGAGA AAGATGCGCA GGATCATGAC TGTGCCTTTC TTTACGAATA AAGTTGTCCA
420

GCACTACAGA TTCGCGTGGG AAGACGAGAT CAGCCGCGTG GTCGCGGATG TGAAATCCCG
480 

CGCCGAGTCT TCCACCTCGG GCATTGTCAT CCGTAGGCGC CTCCAGCTCA TGATGTATAA 
540 

TATTATGTAT AGGATGATGT TCGACAGGAG ATTCGAATCC GAGGACGACC CGCTTTTCCT 
600 

CAAGCTCAAG GCCCTCAACG GAGAGCGAAG TCGATTGGCC CAGAGCTTTG AGTACAATTA 
660 

TGGGGATTTC ATTCCCATTC TTAGGCCCTT CCTCAGAGGT TATCTCAGAA TCTGCAATGA 
720 

GATTAAAGAG AAACGGCTCT CTCTTTTCAA GGACTACTTC GTGGAAGAGC GCAAGAAGC~
780

CAACAGTACC AAGACTAGTA CCAACACCGG GGGAGCTCAA GTGTGCAATG GACCATATTT
840 

TAGATGCTCA GGACAAGGGA GAGATCAATG AGGATAATGT TTTGTACATC GTTGAGAACA 
900 

TCAACGTTGC AGCAATTGAG ACAACGCTGT GGTCGATGGA ATGGGGAATA GCGGAGCTGG 
960 

TGAACCACCA GGACATTCAG AGCAAGGTGC GCGCAGAGCT GGACGCTGTT CTTGGACCAG 
1020 

GCGTGCAGAT AACGGAACCA GACACGACAA GGTTGCCCTA CCTTCAGGCG GTTGTGAAGG
1080 

AAACCCTTCG TCTCCGCATG GCGATCCCGT TGCTCGTCCC CCACATGAAT CTCCACGACG 
1140 

CCAAGCTCGG GGGCTACGAT ATTCCGGCAG AGAGCAAGAT CCTGGTGAAC GCCTGGTGGT 
1200 

TGGCCAACAA CCCCGCCAAC TGGAAGAACC CCGAGGAGTT CCGCCCCGAG CGGTTCTTCG 
1260 

AGGAGGAGAA GCACACCGAA GCCAATGGCA ACGACTTCAA ATTCCTGCCT TCGGTGTGGG 
1320 

GAGGAGGAGC TGCCCGGGAA TCATTCTGGC GCTGCCTCTC CTCGCACTCT CCATCGGAAG 
1380 

ACTTGTTCAG AACTTCCACC TTCTGCCGCC GCCCGttr&G n n rpjihQ XGG "GTCAG7GA 
1440 

GAAGGGCGGG CAGTTCAGCC TTCACATTCT CAACCATTCT CTCATCGTCG CGAAGCCCAT 
1500 

AGCTTCTGCT TAATCCCAAC TTGTCAGTGA CTGGTATATA AATGCGCGCA CCTGAACAAA 
1560 

AAACACTCCA TCTATCATGA CTGTGTGTGC GTGTCCACTG TCGAGTCTAC TAAGAGCTCA 
1620 

TAGCACTTCA AAAGTTTGCT AGGATTTCAA TAACAGACAC CGTCAATTAT GTCATGTTTC 
1680 

AATAAAAGTT TGCATAAATT AAATGATATT TCAATATACT ATTTTGACTC TCCACCAATT 
1740 

GGGGAATTTT ACTGCTAAAA AAAAAAAAAA AAAAAAAAAA AAAAA 
1785 



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 475 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

GAATTCGGCA CGAGATTTCC ATGGACGATT CCGTTTGGCT TCAATTCGTT TCCTCTGGCT 
60 

GTCCTCGTCC TCGTTTTCCT TGTTCTTCCT CCGACTTTTT CTCTGGAAGC TATGGCGTAA 
120 

TAGGAACCTG CCGCCAGGAC CCCCGGCATG GCCGATCGTA GGGAACGTCC TTCAGATTGG 
180 

ATTTTCCAGC GGCGCGTTCG AGACCTCAGT GAAGAAATTC CATGAGAGAT ACGGTCCAAT 
240 

ATTCACTGTG TGGCTCGGTT CCCGCCCTCT GCTGATGATC ACCGACCGCG AGCTTGCCCA 
300 

CGAGGCGCTC GTACAGAAGG GCTCCGTCTT CGCTGAC03C CCGCCCGCCC TCGGGATGCA 
360 

GAAAATCTTC AGTAGCAACC AGCACAACAT CACTTCGGCT GAATACGGCC CGCTGTGGCG 
420 

GAGCCTTCGC AGGAATCTGG TTAAAGAAGC CCTGAGACTT CGGCGATGAA GGCTT 
475 



(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 801 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

GCTCCACCGA CGGTGGACGG TCCGCTACTC AGTAACTGAG TGGGATCCCC CGGGCTGACA 
60 

GGCAATTCGA TTTAGCTCAC TCATTAGGCA CCCCAGGCTT TACACTTTAT GCTTCCGGCT 
120 

CGTATGTTGT GTGGAATTGT GAGCGGATAA CAATTTCACA CAGGAAACAG CTATGACCAT 
180 

GATTACGCCA AGCGCGCAAT TAACCCTCAC TAAAGGGAAC AAAAGCTGGA GCTCCACCGC 
240 

GGTGGCGGCC GCTCTAGAAC TAGTGGATCC AAAGAATTCG GCACGAGACC CAGTGACCTT 
300 

CAGGCCTGAG AGATTTCTTG AGGAAGATGT TGATATTAAG GGCCATGATT ACAGGCTACT 
360 

GCCATTCGGT GCAGGGCGCA GGATCTGCCC TGGTGCACAA TTGGGTATTA ATTTAGTTCA 
420 

GTCTATGTTG GGACACCTGC TTCATCATTT CGTATGGGCA CCTCCTGAGG GAATGAAGGC 
480 

AGAAGACATA GATCTCACAG AGAATCCAGG GCTTGTTACT TTCATGGCCA AGCCTGTGCA 
540 

GGCCATTGCT ATTCCTCGAT TGCCTGATCA TCTCTACAAG CGACAGCCAC TCAATTGATC 
600 

AATTGATCTG ATAGTAAGTT TGAATTTTGT TTTGATACAA AACGAAATAA CGTGCAGTTT 
660 

CTCCTTTTCC ATAGTCAACA TGCAGCTTTC TTTCTCTGAA GCGCATGCAG CTTTCTTTCT 
720 

CTGAAGCCCA ACTTCTAGCA AGCAATAACT GTATATTTTA GAACAAATAC CTATTCCTCA 
780 

AATTGAGTAT TTCTCTGTAG G 
801 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 744 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

GGGCCCCCCT TCGAGGTGGA CACTAGTGGA TCCAAAGAAT TCGGCACGAG GTTTTATCTG
60

AAGGACGCTG TGCTTGAAGG CTCCCAGCCA TTCACCAAAG CCCATGGAAT GAATGCGTTC
120

GAGTACCCGG CCATCGATCA GAGATTCAAC AAGATTTTCA ACAGGGCTAT GTCTGAGAAT
180

TCTACCATGT TGATGAACAA GATTTTGGAT ACTTACGAGG GTTTTAAGGA GGTTCAGGAG
240

TTGGTGGATG TGGGAGGAGG TATTGGGTCG ACTCTCAATC TCATAGTGTC TAGGTATCCC
300

CACATTTCAG GAATCAACTT CGACTTGTCC CATGTGCTGG CCGATGCTCC TCACTACCCA
360

GCTGTGAAAC ATGTGGGTGG AGACATGTTT GATAGTGTAC CAAGTGGCCA AGCTATTTTT
420

ATGAAGTGGA TTCTGCATGA TTGGAGCGAT GATCATTGCA GGAAGCTTTT GAAGAATTGT
480

CACAAGGCGT TGCCAGAGAA GGGGAAGGTG ATTGCGGTGG ACACCATTCT CCCAGTGGCT
540 

GCAGAGACAT CTCCTTATGC TCGTCAGGGA TTTCATACAG ATTTACTGAT GTTGGCATAC
600 

AACCCAGGGG GCAAGGAACG CACAGAGCAA GAATTTCAAG ATTTAGCTAA GGAGACGGGA 
660 

TTTGCAGGTG GTGTTGAACC TGTATGTTGT GTCAATGGAA TGTGGGTAAT GGAATTCCTG 
720 

CAGCCCGGGG GATCCACTAG TTCT 
744 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 426 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

GTGGCCCTGG AAGTAGTGIG CGCGACATGG ATTCCTTGAA TTTGAACGAG TTTATGTTGT
60 

GGTTTCTCTC TTGGCTTGCT CTCTACATTG GATTTCGTTA TGTTTTGAGA TCGAACTTGA 
120 

AGCTCAAGAA GAGGCGCCTC CCGCCGGGCC CATCGGGATG GCCAGTGGTG GGAAGTCTGC 
180 

CATTGCTGGG AGCGATGCCT CACGTTACTC TCTACAACAT GTATAAGAAA TATGGCCCCG 
240 

TTGTCTATCT CAAACTGGGG ACGTCCGACA TGGTTGTGGC CTCCACGCCC GCTGCAGCTA 
300 

AGGCGTTTCT GAAGACTTTG GATATAAACT TCTCCAACCG GCCGGGAAAT GCAGGAGCCA 
360 

CGTACATCGC CTACGATTCT CAGGACATGG TGTGGGCAGC GTATGGAGGA CGGTGGAAGA 
420 

TGGAGC 
426 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 562 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53: 

CAGTTCGAAA TTAACCTCAC TAAAGGGAAC AAAAGCTGGA GTTCGCGCGC CTGCAGGTCG 
60 

ACACTAGTGG ATCCAAAGAA TTCGGCACGA GCTTTGAGGC AACCTACATT CATTGAATCC 
120 

CAGGATTTCT TCTTGTCCAA ACAGGTTTAA GGAAATCGCA GGCACAAGTG TTGCTGCAGC 
180 

AGAGGTGAAG GCTCAGACAA CCCAAGCAGA GGAGCCGGTT AAGGTTGTCC GCCATCAAGA 
240 

AGTGGGACAC AAAAGTCTTT TGCAGAGCGA TGCCCTCTAT CAGTATATAT TGGAAACGAG 
300 

CGTGTACCCT CGTGAGCCCG AGCCAATGAA GGAGCTCCGC GAAGTGACTG CCAAGCATCC 
360 

CTGGAACCTC ATGACTACTT CTGCCGATGA GGGTCAATTT CTGGGCCTCC TGCTGAAGCT 
420 

CATTAACGCC AAGAACACCA TGGAGATTGG GGTGTACACT GGTTACTCGC TTCTCAGCAC 
480 



AGCCCTTGCA TTGCCCGATG ATGGAAAGAT TCTAGCCATG GACATCAACA GAGAGAACTA
540 

TGATATCGGA TTGCCTATAA TT 
562 

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1074 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

TCGTGCCGCT CGATCCTCAC AGGCCCTTTT TATTTCCCTG GTGAACGATA CGATGGGCTC 
60 

GCACGCTGAG AATGGCAACG GGGTGGAGGT TGTTGATCCA ACGGACTTAA CTGACATCGA 
120 

GAATGGGAAA CCAGGTTATG ACAAGCGTAC GCTGCCTGCG GACTGGAAGT TTGGAGTGAA 
180 

GCTTCAAAAC GTTATGGAAG AATCCATTTA CAAGTACATG CTGGAAACAT TCACCCGCCA 
240 

TCGAGAGGAC GAGGCGTCCA AGGAGCTCTG GGAACGAACA TGGAACCTGA CACAGAGAGG 
300 

GGAGATGATG ACATTGCCAG ATCAGGTGCA GTTCCTGCGC TTGATGGTAA AGATGTCAGG 
360 

TGCTAAAAAG GCATTGGAGA TCGGAGTTTT CACTGGCTAT TCATTGCTCA ATATCGCTCT 
420 

CGCTCTTCCT TCTGATGGCA AGGTGGTAGC TGTGGATCCA GGAGATGACC CCAAATTTGG 
480 

CTGGCCCTGC TTCGTTAAGG CTGGAGTTGC AGACAAAGTG GAGATCAAGA AAACTACAGG
540 

GTTGGACTAT TTGGATTCCC TTATTCAAAA GGGGGAGAAG GATTGCTTCG ACTTTGCATT 
600 

CGTGGACGCA GACAAAGTGA ACTACGTGAA CTATCATCCA CGGCTGATGA AGTTAGTGCG 
660 

CGTGGGGGGC GTCATAATTT ACGACGACAC CCTCTGGTTT GGTCTGGTGG GAGGAAAGGA 
720 

TCCCCACAAC CTGCTTAAGA ATGATTACAT GAGGACTTCT CTGGAGGGTA TCAAGGCCAT 
780 

CAACTCCATG GTAGCCAACG ACCCCAACTT GGAGGTCGCC ACAGTCTTTA TGGGATATGG 
840 

TGTCACTGTT TGTTACCGCA CTGCTTAGTT AGCTAGTCCT CCGTCATTCT GCTATGTATG 
900 

TATATGATAA TGGCGTCGAT TTCTGATATA GGTGGTTTTT CAATGTTTCT ATCGTCATGT 
960 

TTTCTGTTTA GCCAGAATGT TTCGATCGTC ATGGTTTCTG TTAAAGCCAG AATAAAATTA 
1020 

GCCGCTTGCA GTTCAAAAAA AAAAAAAAAA AAAAACTCGA GACTAGTTCT CTTC 
1074 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1075 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55: 

TCGGAGCTCT CGAATCCTCA CAGGCCCTTT TTATTTCCCT GGTGAACGAT ACGATGGGCT 
60 

CGCACGCTGA GAATGGCAAC GGGGTGGAGG TTGTTGATCC AACGGACTTA ACTGACATCG
120 

AAGAATGGGA AACCAGGTTA TGACAAGCGT CGCTGCCTGC GGACTGGAAG TTYGGAGTGA 
180 

AGCTTCAAAA CGTTATGGAA GAATCCATTT ACAAGTACAT GCTGGAAACA TTCACCCGCC 
240 

ATCGAGAGGA CGAGGCGTCC AAGGAGCTCT GGGAACGAAC ATGGAACCTG ACACAGAGAG 
300 

GGGAGATGAT GACATTGCCA GATCAGGTGC AGTTCCTGCG CTTGATGGTA AAGATGTCAG
360 

GTGCTAAAAA GGCATTGGAG ATCGGAGTTT TCACTGGCTA TTCATTGCTC AATATCGCTC 
420 

TCGCTCTTCC TTCTGATGGC AAGGTGGTAG CTGTGGATCC AGGAGATGAC CCCAAATTTG 
480 

GCTGGCCCTG CTTCGTTAAG GCTGGAGTTG CAGACAAAGT GGAGATCAAG AAAACTACAG 
540 

GGTTGGACTA TTTGGATTCC CTTATTCAAA AGGGGGAGAA GGATTGCTTC GACTTTGCAT 
600 

TCGTGGACGC AGACAAAGTG AACTACGTGA ACTATCATCC ACGGCTGATG AAGTTAGTGC 
660 

GCGTGGGGGG CGTCATAATT TACGACGACA CCCTCTGGTT TGGTCTGGTG GGAGGAAAGG 
720 

ATCCCCACAA CCTGCTTAAG AATGATTACA TGAGGACTTC TCTGGAGC3GT ATCAAGGCCA 
780 

TCAACTCCAT GGTAGCCAAC GACCCCAACT TGGAGGTCGC CACAGTCTTT ATGGGATATG
840

GTGTCACTGT TTGTTACCGC ACTGCTTAGT TAGCTAGTCC TCCGTCATTC TGCTATGTAT 
900 

GTATATGATA ATGGCGTCGA TTTCTGATAT AGGTGGTTTT TCAATGTTTC TATCGTCATG 
960 

TTTTCTGTTT AGCCAGAATG TTTCGATCGT CATGGTTTCT GTTAAAGCCA GAATAAAATT 
1020 

AGCCGCTTGC AGTTCAAAAA AAAAAAAAAA AAAAAACTCG AGACTAGTTC TCTTC 
1075 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1961 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 



GTTTTCCGCC ATTTTTCGCC TGTTTCTGCG GAGAATTTGA TCAGGTTCGG ATTGGGATTG
60

AATCAATTGA AAGGTTTTTA TTTTCAGTAT TTCGATCGCC ATGGCCAACG GAATCAAGAA
120

GGTCGAGCAT CTGTACAGAT CGAAGCTTCC CGATATCGAG ATCTCCGACC ATCTGCCTCT
180

TCATTCGTAT TGCTTTGAGA GAGTAGCGGA ATTCGCAGAC AGACCCTGTC TGATCGATGG
240

GGCGACAGAC AGAACTTATT GCTTTTCAGA GGTGGAACTG ATTTCTCGCA AGGTCGCTGC
300

CGGTCTGGCG AAGCTCGGGT TGCAGCAGGG GCAGGTTGTC ATGCTTCTCC TTCCGAATTG
360

CATCGAATTT GCGTTTGTGT TCATGGGGGC CTCTGTCCGG GGCGCCATTG TGACCACGGC
420

CAATCCTTTC TACAAGCCGG GCGAGATCGC CAAACAGGCC AAGGCCGCGG GCGCGCGCGA
480

TCATAGTTAC CCTGGCAGCT TATGTGGAGA AACTGGCCGA TCTGCAGAGC CACGATGTGC
540

TCGTCATCAC AATCGATGAT GCTCCCAAGG AAGGTTGCCA ACATATTTCC CTTCTGACCG
600

AAGCCGACGA AACCCAATGC CCGGCCGTGA CAATCCACCC GGACGATGTC GTGGCGTTGC
660 

CCTATTCTTC CGGAACCACG GGGCTCCCCA AGGGCGTGAT GTTAACGCAC AAAGGCCTGG
720

"gTCCAGCGT TGCCCAGCAG GTCGATGGTG AAAATCCCAA TCTGTATTTC CATTCCGATG
780

ACGTGATACT CTGTGTCTTG CCTCTTTTCC ACATCTATTC TCTCAATTCG GTTCTCCTCT
840

"cGCGCTCAG AGCCGGGGCT GCGACCCTGA TTATGCAGAA ATTCAACCTC ACGACCTGTC
900

TGGAGCTGAT TCAGAAATAC AAGGTTACCG TTGCCCCAAT TGTGCCTCCA ATTGTCCTGG
960

ACATCACAAA GAGCCCCATC GTTTCCCAGT ACGATGTCTC GGCCGTCCGG ATAATCATGT
1020

"cGGCGCTGC GCCTCTCGGG AAGGAACTCG AAGATGCCCT CAGAGAGCGT TTTCCCAAGG
1080

CCATTTTCGG GCAGGGCTAC GGCATGACAG AAGCAGGCCC GGTGCTGGCA ATGAACCTAG
1140

CCTTCGCAAA GAATCCTTTC CCCGTCAAAT CTGGCTCCTG CGGAACAGTC GTCCGGAACG
1200

CTCAAATAAA GATCCTCGAT ACAGAAACTG GCGAGTCTCT CCCGCACAAT CAAGCCGGCG
1260

AAATCTGCAT CCGCGGACCC GAAATAATGA AAGGATATAT TAACGACCCG GAATCCACGG
1320

CCGCTACAAT CGATGAAGAA GGCTGGCTCC ACACAGGCGA CGTCGGGTAC ATTGACGATG
1380

"cGAAGAAAT CTTCATAGTC GACAGAGTAA AGGAGATTAT CAAATATAAG GGCTTCCAGG
1440

"ggCTCCTGC TGAGCTGGAA GCTTTACTTG TTGCTCATCC GTCAATCGCT GACGCAGCAG
1500

"cGTTCCTCA AAAGCACGAG GAGGCGGGCG AGGTTCCGGT GGCGTTCGTG GTGAAGTCGT
1560

^GGAAATCAG CGAGCAGGAA ATCAAGGAAT TCGTGGCAAA GCAGGTGATT TTCTACAAGA
1620

AAATACACAG AGTTTACTTT GTGGATGCGA TTCCTAAGTC GCCGTCCGGC AAGATTCTGA
1680

GAAAGGATTT GAGAAGCAGA CTGGCAGCAA AATGAAAATG AATTTCCATA TGATTCTAAG
1740

ATTCCTTTGC CGATAATTAT AGGATTCCTT TCTGTTCACT TCTATTTATA TAATAAAGTG
1800

"iGCAGAGTA AGCGCCCTAT AAGGAGAGAG AGAGCTTATC AATTGTATCA TATGGATTGT
1860

"aacgcccta CACTCTTGCG ATCGCTTTCA ATATGCATAT TACTATAAAC GATATATGTT
1920

TTTTTTATAA ATTTACTGCA CTTCTCGTTC AAAAAAAAAA A 
1961 

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1010 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

GACAAACTTG GTCGTTTGTT TAGGTTTTGC TGCAGGTGAA CACTAATATG GAAGGCCAGA
60

?TGCAGCATT AAGCAAAGAA GATGAGTTCA TTTTTCACAG CCCTTTTCCT GCAGTACCTG
120

"tCCAGAGAA TATAAGTCTT TTCCAGTTTG TTCTGGAAGG TGCTGAGAAA TACCGTGATA
180

"ggtggccct CGTGGAGGCC TCCACAGGGA AGGAGTACAA CTATGGTCAG GTGATTTCGC
240 

TCACAAGGAA TGTTGCAGCT GGGCTCGTGG AGAAAGGCAT TCAAAAGGGC GATGTTGTAT
300 

TTGTTCTGCT TCCAAATATG GCAGAATACC CCATTATTGT GCTGGGAATA ATGTTGGCCG 
360 

GCGCAGTGTT TTCTGGGGCA AATCCTTCTG CACACATCAA TGAAGTTGAA AAACATATCC 
420 

AGGATTCTGG AGCAAAGATT GTTGTGACAG TTGGGTCTGC TTATGAGAAG GTGAGGCAAG 
480 

TGAAACTGCC TGTTATTATT GCAGATAACG AGCATGTCAT GAACACAATT CCATTGCAGG 
540 

AAATTTTTGA GAGAAACTAT GAGGCCGCAG GGCCTTTTGT ACAAATTTGT CAGGATGATC
600 

TGTGTGCACT CCCTTATTCC TCTGGCACCA CAGGGGCCTC TAAAGGTGTC ATGCTCACTC 
660 

ACAGAAATCT GATTGCAAAT CTGTGCTCTA GCTTGTTTGA TGTCCATGAA TCTCTTGTAG 
720 

GAAATTTCAC CACGTTGGGG CTGATGCCAT TCTTTCACAT ATATGGCATC ACGGGCATCT 
780 

GTTGCGCCAC TCTTCGCAAC GGAGGCAAGG TCGTGGTCAT GTCCAGATTC GATCTCCGAC 
840 

ACTTTATCAG TTCTTTGATT ACTTATGAGG TCAACTTCGC GCCTATTGTC CCGCCTATAA
900

TGCTCTCCCT CCGGTTTAAA AATCCTATCG TTAACGAGTT CGATCTCAGC CGGTTGAAAC
960 

TCCAAAGCTG TTCATGACTG CGGCTGCTCC ACTGGCGCCG GATCTACTGC 
1010 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 741 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

GAATTCGGCA CGAGACCATT TCCAGCTAAT ATTGGCATAG CAATTGGTCA TTCTATCTTT 
60 

GTCAAAGGAG ATCAAACAAA TTTTGAAATT GGACCTAATG GTGTGGAGGC TAGTCAGCTA 
120 

TACCCAGATG TGAAATATAC CACTGTCGAT GAGTACCTCA GCAAATTTGT GTGAAGTATG 
180 

CGAGATTCTC TTCCACATGC TTCAGAGATA CATAACAGTT TCAATCAATG TTTGTCCTAG 
240 

GCATTTGCCA AATTGTGGGT TATAATCCTT CGTAGGTGTT TGGCAGAACA GAACCTCCTG 
300 

TTTAGTATAG TATGACGAGC TAGGCACTGC AGATCCTTCA CACTTTTCTC TTCCATAAGA 
360 

AACAAATACT CACCTGTGGT TTGTTTTCTT TCTTTCTGGA ACTTTGGTAT GGCAATAATG 
420 

TCTTTGGAAA CCGCTTAGTG TGGAATGCTA AGTACTAGTG TCCAGAGTTC TAAGGGAGTT 
480 

CCAAAATCAT GGCTGATGTG AACTGGTTGT TCCAGAGGGT GTTTACAACC AACAGTTGTT 
540 

CAGTGAATAA TTTTGTTAGA GTGTTTAGAT CCATCTTTAC AAGGCTATTG AGTAAGGTTG 
600 

GTGTTAGTGA ACGGAATGAT GTCAAATCTT GATGGGCTGA CTGACTCTCT TGTGATGTCA 
660 

AATCTTGATG GATTGTGTCT TTTTCAATGG TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
720 

AAAAAAAAAA AAAAAAAAAA A 
741 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 643 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

CTCATCTCGG AGTTGCAGGC TGCAGCTTTT GGCCCAAAGC ATGATATCAG ATCAAACGAC
60

GCAGATGAAG CAAACGGATC AAACAGTTTG CGTTACTGGA GCAGCGGGTT TCATTGCCTC
120

ATGGCTTGTC AAGATGCTCC TCATCAGAGG TTACACTGTC AGAGCAGCAG TTCGGACCAA
180

"cCAGCTGAT GATAGGTGGA AGTATGAGCA TCTGCGAGAG TTGGAAGGAG CAAAAGAGAG
240

GCTTGAGCTT GTGAAAGCTG ATATTCTCCA TTACCAGAGC TTACTCACAG TCATCAGAGG
300

TTGCCACGGT GTCTTTCACA TGGCTTCAGT TCTCAATGAT GACCCTGAGC AAGTGATAGA
360

ACCAGCAGTC GAAGGGACGA GGAATGTGAT GGAGGCCTGC GCAGAAACTG GGGTGAAGCG
420

CGTTGTTTTT ACTTCTTCCA TCGGCGCAGT TTACATGAAT CCTCATAGAG ACCCGCTCGC
480

GATTGTCCAT GATGACTGCT GGAGCGATTT GACTACTGCG TACAAACCAA GAATTGGTAT
540

TGCTATGCAA AAACCTTGGC AGAGAAATCT GCATGGGATA TTGCTAAGGG AAGGAATTTA
600

GAGCTTGCAG TGATAAATCC AGGCCTGGCC TTAGGTCCCT TGA
643 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 441 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

GAATTCGGCA CGAGAATTTT TCTGTGGTAA GCATATCTAT GGCTCAAACC AGAGAGAAGG
60

ACGATGTCAG CATAACAAAC TCCAAAGGAT TGGTATGCGT GACAGGAGCG GCTGGTTACT
120

^GGCATCTTG GCTTATCAAG CGTCTCCTCC AGTGTGGTTA CCAAGTGAGA GGAACTGTGC
180

GGGATCCTGG CAATGAGAAA AAGATGGCTC ATTTATGGAA GTTAGATGGG GCGAAAGAGA
240

GACTGCAACT AATGAAAGCT GATTTAATGG ACGAGGGCAG CTTCGATGAG GTCATCAGAG
300

GCTGCCATGG TGTTTTTCAC ACAGCGTCTC CAGTCGTGGG TGTCAAATCA GATCCCAAGA
360

TATGGTATGC TCTGGCCAAG ACTTTAGCAG AAAAAGCAGC ATGGGATTTT GCCCAAGAAA
420 

ACCATCTGGA CATGGTTGCA G 
441 

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 913 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

GAATTCGGCA CGAGoAAAAC ATCATCCAGG CATTTTGGAA ATTTAGCTCG CCGGTTGATT
60

CAGGATCCTG CAATGGCTTT TGGCGAAGAG CAGACTGCCT TGCCACAAGA AACGCCTTTG
120

AATCCTCCGG TCCATCGAGG AACAGTGTGC GTTACAGGAG CTGCTGGGTT CATAGGGTCA
180

TGGCTCATCA TGCGATTGCT TGAGCGAGGA TATAGTGTTA GAGCAACTGT GCGAGACACT
240

GGTAATCC* j AAAGACrtAA GCATCTGTTG GATCTGCCGG GGGCAAATGA GAGATTGACT
300

CTCTGGAAAG CAGATTTGGA TGATGAAGGA AGCTTTGATG CTGCCATTGA TGGGTGTGAG
360

GGTGTTTTCC ATGTTGCCAC TCCCATGGAT TTCGAGTCCG AGGATCCCGA GAATGAGATA
420

ATTAAGCCAA CAATCAACGG GGTCTTGAAT GTTATGAGAT CGTGTGCAAA AGCCAAGTCC
480

GTGAAGCGAG TTGTTTTCAC GTCATCTGCT GGGACTGTGA ATTTTACAGA TGATTTCZAA
540

ACACCAGGCA AAGTTTTTGA CGAATCATGC TGGACCAACG TGGATCTTTG CAGAAAAGTT
600

AAAATGACAG GATGGATGTA CTTTGTATCG AAGACATTAG CAGAGAAAGC TGCTTGGGAT
660

TTTGCAGAGG AGAACAAGAT CGATCTCATT ACTGTTATCC CCACATTGGT CGTTGGACCA
720

TTCATTATGC AGACCATGCC ACCGAGCATG ATCACAGCCT TGGCACTGTT AACGCGGAAT
780

GAACCCCACT ACATGATACT GAGACAGGTA CAGCTGGTTC ACTTGGATGA
840

TCACATATCT TTGTATATGA ACATCCTGAA GCAAAGGGCA GATACATCTC TTCCACATGT
900

GATGCTACCC ATT
913

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 680 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

GAATTCGGCA CGAGATCAAT TTTTGCATAT TATTAAAAAG TAAGTGTATT CGTTCTCTAT 
60 

ATTGATCAGT CACAGAGTCA TGGCCAGTTG TGGTTCCGAG AAAGTAAGAG GGTTGAATGG 
120 

AGATGAAGCA TGCGAAGAGA ACAAGAGAGT GGTTTGTGTA ACTGGGGCAA ATGGGTACAT 
180 

CGGCTCTTGG CTGGTCATGA GATTACTGGA ACATGGCTAT TATGTTCATG GAACTGTTAG 
240 

GGACCCAGAA GACACAGGGA AGGTTGGGCA TTTGCTGCGG CTCCCAGGGG CAAGTGAGAA 
300 

GCTAAAGCT3 TTCAAGGCAG AGCTTAACGA CGAAATGGCC TTTGATGATG CTGTGAGCGG 
360 

TTGTCAAGGG GTTTTCCACG TTGCCAAGCC TGTTAATCTG GACTCAAACG CTCTT<~AGG~ 
420 

GGAGGTTGTT GGTCCTGCGG TGAGGGGAAC AGTAAATCTG CTTCGAGCCT GCGAACGAT r 
480 

GGGCACTGTG AAACGAGTGA TACATACCTC GTCCGTTTCA GCAGTGAGAT TCACTGGGAA
540

ACCTGACCCC CCTGATACTG TGCTGGATGA ATCTCATTGG ACTTCGGTCG AGTATTGCAG
600 

AAAGACAAAG ATGGTCGGAT GGATGTACTA CATCGCCAAC ACTTATGCAG AAGAGGGAGC 
660 

CCATAAGTTC GGATCAGAGA 
680 

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 492 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

GAATTCGGCA CGAGGCTGGT TCAAGTGTCA GCCCAATGGC CTCCCCTACA GAGAATCCCC
60

AGATTTCAGA AGAGCTGCTA AATCATGAGA TCCATCAAGG AAGTACAGTA TGTGTGACAG
120

GAGCTGCTGG CTTCATAGGA TCATGGCTCG TCATGCGTTT GCTTGAGCGA GGATATACTG
180

TTAGAGGAAC TGTGCGAGAC ACTGGTAATC CGGTGAAGAC GAAGCATCTA TTGGATCTGC
240

CTGGGGCGAA TGAGAGGTTA ACTCTCTGGA AAGCAGATTT GGATGATGAA GGAAGCTTTG
300

ACGCCGCCAT TGATGGTTGT GAGGGAGTTT TCCATGTTGC CACTCCCATG GATTTTGAAT
360

CCGAGGACCC CGAGAACGAG ATAATTAAAC CCGCTGTCAA TGGGATGTTG AATGTTTTGA
420

GATCGTGTGG GAAAACCAAG TCTATGAAGC GAGTTGTTTT CACGTCGTCT GCTGGGACTC 
480 

TGCTTTTTAC GG 
492 

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 524 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

GAATTCGGCA CGAGCTTGTT CAAAGTCACA TATCTTATTT TCTTTGTGAT ATCTGCAATT
60

TCCAAGCTTT TCGTCTACCT CCCTGAAAAG ATGAGCGAGG TATGCGTGAC AGGAGGCACA
120

GGCTTCATAG CTGCTTATCT CATTCGTAGT CTTCTCCAGA AAGGTTACAG AGTTCGCACT
180

ACAGTTCGCA ACCCAGATAA TGTGGAGAAG TTTAGTTATC TGTGGGATCT GCCTGGTGCA
240

AACGAAAGAC TCAACATCGT GAGAGCAGAT TTGCTAGAGG AAGGCAGTTT TGATGCAGCA
300

GTAGATGGTG TAGATGGAGT ATTCCATACT GCATCACCTG TCTTAGTCCC ATATAACGAG
360

CGCTTGAAGG AAACCCTAAT AGATCCTTGT GTGAAGGGCA CTATCAATGT CCTCAGGTCC
420

TGTTCAAGAT CACCTTCAGT AAAGCGGGTG GTGCTTACAT CCTCCTCCTC ATCAATACCG
480

ATACGACTAT AATAGCTTAG AGCGTTCCCT GCTGGACTGA GTCA
524 

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 417 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

TCCTAATTGT TCGATCCTCC CTTTTAAAGC CCTTCCCTGG CCTTCATTCC AGGTCACAGA
60 

GTTGTTCATG CAGTGCTAGC AGGAGGAGCA GCGTTGCAAT TGGGGAAAAT TCCAAAATCA 
120 

ATAACGAGAG GACAGAAGTA AGTTTGTGGA AATAGCAACC ATGCCGGTGT TTCCTTCTGG 
180 

TCTGGACCCC TCTGAGGACA ATGGCAAGCT CGTTTGTGTC ATGGATGCGT CCAGTTATGT 
240 

AGGTTTGTGG ATTGTTCAGG GCCTTCTTCA ACGAGGCTAT TCAGTGCATG CCACGGTGCA
300

GAGAGACGCT GGCGAGGTTG AGTCTCTCAG AAAATTGCAT GGGGATCGAT TGCAGATCTT
360 

CTATGCAGAT GTCTTGGATT ATCACAGCAT TACTGATGCG CTCAAGGGCT GTTCTGG 
417 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 511 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

ATGACACGAA TTTGTGCCTC TCTCTGACCA GAGCTTGAAG CTCTGTCTTC TCTGATATCG
60 

CTTCATTCCA TCATCCAGGA GCTTCTGTTA TATCCATTTC CTCAAAATGG ATGCCTACCT 
120 

TGAAGAAAAT GGATACGGCG CTTCCAATTC TCGGAAATTA ATGTGCCTTA CCGGGGGCTG 
180 

GAGTTTCCTG GGGATTCATA TCGCAAGAAT GCTGCTCGGC CGGGGTTACT CAGTCCGTTT 
240 

CGCAATTCCG GTAACGCCAG AAGAGGCAGG CTCACTTATG GAATCCGAAG AAGCATTATC 
300 

GGGGAAGCTG GAGATATGCC AAGCCGATCT CTTGGATTAT CGCAGCGTTT TCGGCAACAT 
360 

CAATGGTTGC TCCGGAGTCT TCCACGTCCC TGCGCCCTGT GATCATCTGG ATGGATTACA 
420 

GGAGTATCCG GTATGATTAG TTTAATAGAT TGACGGGGTA TCCTGTATGA ATTAGTTTAT 
480 

GAATTTAAGG TTTTCTTAGA ATTTGGATAC T 
511 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 609 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

CATTGATAGT TGATGGAAGA CCATCAGTAA AGCATGAAAA AGAAATTGTT CCAAGGTGAA
60

GAAGTCAGT? GCTCCAGCAG AACCTTTTTA GCAATTGTTT TTGTATCCTT TTTGCCTTTG
120

AATATGTAA? CCATAAACTT ATGCAGGAAG TGCCTCGTGC CGAATTCGGC ACGAGAATCA
180

CTGACCTTCA CATATTTATT CCAATTCTAA TATCTCTACT CGCTGTCTAC CTGATTTTTC
240

AGTGGCGAAC CAACTTGACA GGGTTGGACA TGGCCAACAG CAGCAAGATT CTGATTATTG
300

GAGGAACAGG CTACATTGGT CGTCATATAA CCAAAGCCAG CCTTGCTCTT GGTCATCCCA
360

CATTCCTTCT TGTCAGAGAG ACCTCCGCTT CTAATCCTGA GAAGGCTAAG CTTCTGGAAT
420

CCTTCAAGGC CTCAGGTGCT ATTATACTCC ATGGATCTTT GGAGGACCAT GCAAGTCTTG
480

TGGAGGCAAT CAAGAAAGTT GATGTAGTTA TCTCGGCTGT CAAGGGACCA CAGCTGACGG
540

TTCAAACAGG ATATTTATCC AGGGTATTTA AAGGGAGGGT TGGAACCCAT CAAGAAGGGT
600

TTTGGCCAA 
609 

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 474 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

GCAAGATAGG TTTTATTCTT CTGGAGTTGG GTGAGGCTTG GAAATTTAAG TAAAAAGGGT
60

GCATAGCAAT TAAGCAGTTG CAGCCATGGC GGTCTGTGGA ACTGAAGTAG CTCATACTGT
120

G?TCTATGTA GCTGCAGACA TGGTGGAAAA CAACACGTCT ATTGTGACCA CCTCTATGGC
180

^GCAGCAAA? TGTGAGATGG AGAAGCCTCT TCTAAATTCC TCTGCCACCT CAAGAATACT
240

GGTGATGGGA GCCACAGGTT ACATTGGCCG TTTTGTTGCC CAAGAAGCTG TTGCTGCTGG
300

?CATCCTACG TATGCTCTTA TACGCCCGTT TGCTGCTTGT GACCTGGCCA AAGCACAGCG
360

CGTCCAACAA TTGAAGGATG CCGGGGTCCA TATCCTTTAT GGGTCTTTGA GTGATCACAA
420

"cTCTTAGTA AATACATTGA AGGACATGGG CCGTTGTTAT CTCTACCATT GGAG
474 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 474 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:




WO 98/11205 



PCT/NZ97/00112 



GCAAGATAGG TTTTATTCTT CTGGAGTTGG GTGAGGCTTG GAAATTTAAG TAAAAAGGGT
60

GCATAGCAAT TAAGCAGTTG CAGCCATGGC GGTCTGTGGA ACTGAAGTAG CTCATACTGT
120

GCTCTATGTA GCTGCAGACA TGGTGGAAAA CAACACGTCT ATTGTGACCA CCTCTATGGC
180

TGCAGCAAAT TGTGAGATGG AGAAGCCTCT TCTAAATTCC TCTGCCACCT CAAGAATACT
240

GGTGATGGGA GCCACAGGTT ACATTGGCCG TTTTGTTGCC CAAGAAGCTG TTGCTGCTGG
300

TCATCCTACC TATGCTCTTA TACGCCCGTT TGCTGCTTGT GACCTGGCCA AAGCACAGCG
360

CGTCCAACAA TTGAAGGATG CCGGGGTCCA TATCCTTTAT GGGTCTTTGA GTGATCACAA
420

CCTCTTAGTA AATACATTGA AGGACATGGG CCGTTGTTAT CTCTACCATT GGAG
474

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 608 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

CATTGATAGT TGATGGAAGA CCATCAGTAA AGCATGAAAA AGAAATTGTT CCAAGGTGAA
60

GAAGTCAGTT GCTCCAGCAG AACCTTTTTA GCAATTGTTT TTGTATCCTT TTTGCCTTTG
120

AATATGTAAT CCATAAACTT ATGCAGGAAG TGCCTCGTGC CGAATTCGGC ACGAGAATCA
180

CTGACCTTCA AATATTTATT CCAATTCTAA TATCTCTACT CGCTGTCTAC CTGATTTTTC
240

AGTGGCGAAC CAACTTGACA GGGTTGGACA TGGCCAACAG CAGCAAGATT CTGATTATTG
300

GAGGAACAGG CTACATTGGT CGTCATATAA CCAAAGCCAG CCTTGCTCTT GGTCATCCCA
360

CATTCCTTCT TGTCAGAGAG ACCTCCGCTT CTAATCCTGA GAAGGCTAAG CTTCTGGAAT
420 

CCTTCAAGGC CTCAGGTGCT ATTATACTCC ATGGATCTTT GGAGGACCAT CCAAGTCTTG
480

TGGAGGCAAT CAAGAAAGTT GATGTAGTTA TCTCGGCTGT CAAGGGACCA CAGCTGACGG
540

ATCAAACAGG ATATTTATCC AGGGTATTTA AAGGGAGGTT GGAACCCATC AAGAAGGGTT
600 

TTGGCCAA 
608 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1474 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

GAATTCGGCA CGAGAAAACG TCCATAGCTT CCTTGCCAAC TGCAAGCAAT ACAGTACAAG
60

AGCCAGACGA TCGAATCCTG TGAAGTGGTT CTGAAGTGAT GGGAAGCTTG GAATCTGAAA
120 






AAACTGTTAC AGGATATCCA GCTCGGGACT CCAGTGGCCA CTTGTCCCCT TACACTTACA
180

ATCTCAGAAA GAAAGGACCT GAGGATGTAA TTGTAAAGGT CATTTACTGC GGAATCTGCC
240

ACTCTGATTT AGTTCAAATG CGTAATGAAA TGGACATGTC TCATTACCCA ATGGTCCCTG
300

GGCATGAAGT GGTGGGGATT GTAACAGAGA TTGGCAGCGA GGTGAAGAAA TTCAAAGTGG
360

GAGAGCATGT AGGGGTTGGT TGCATTGTTG GGTCCTGTCG CAGTTGCGGT AATTGCAATC
420

AGAGCATGGA ACAATACTGC AGCAAGAGGA TTTGGACCTA CAATGATGTG AACCATGACG
480

GCACACCTAC TCAGGGCGGA TTTGCAAGCA GTATGGTGGT TGATCAGATG TTTGTGGTTC
540

GAATCCCGGA GAATCTTCCT CTGGAACAAG CGGCCCCTCT GTTATGTGCA GGGGTTACAG
600

TTTTCAGCCC AATGAAGCAT TTCGCCATGA CAGAGCCCGG GAAGAAATGT GGGATTTTGG
660

GTTTAGGAGG CGTGGGGCAC ATGGGTGTCA AGATTGCCAA AGCCTTTGGA CTCCACGTGA
720

CGGTTATCAG TTCGTCTGAT AAAAAGAAAG AAGAAGCCAT GGAAGTCCTC GGCGCCGATG
780

CTTATCTTG? TAGCAAGGAT ACTGAAAAGA TGATGGAAGC AGCAGAGAGC CTAGATTACA
840

TAATGGACAC CATTCCAGTT GCTCATCCTC TGGAACCATA TCTTGCCCTT CTGAAGACAA
900

ATGGAAAGCT AGTGATGCTG GGCGTTGTTC CAGAGCCGTT GCACTTCGTG ACTCCTCTCT
960

TAATACTTGG GAGAAGGAGC ATAGCTGGAA GTTTCATTGG CAGCATGGAG GAAACACAGG
1020

AAACTCTAGA TTTCTGTGCA GAGAAGAAGG TATCATCGAT GATTGAGGTT GTGGGCCTGG
1080

ACTACATCAA CACGGCCATG GAAAGGTTGG AGAAGAACGA TGTCCGTTAC AGATTTGTGG
1140

TGGATGTTGC TAGAAGCAAG TTGGATAATT AGTCTGCAAT CAATCAATCA GATCAATGCC
1200

TGCATGCAAG ATGAATAGAT CTGGACTAGT AGCTTAACAT GAAAGGGAAA TTAAATTTTT
1260

ATTTAGGAAC TCGATACTGG TTTTTGTTAC TTTAGTTTAG CTTTTGTGAG GTTGAAACAA
1320

TTCAGATGTT TTTTTAACTT GTATATGTAA AGATCAATTT CTCGTGACAG TAAATAATAA
1380

TCCAATGTCT TCTGCCAAAT TAATATATGT ATTCGTATTT TTATATGAAA AAAAAAAAAA
1440

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAA
1474

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1038 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

GAATTCGGCA CGAGAGAGGG TTATATATCT TGATTCTGAC CTGATTGTCG TCGACGACAT
60

TGCCAAGCTC TGGGCCACGG ATTTGGAATC TCGTGTCCTC GGGGCACCAG AGTACTGCAA
120 

GGCGAATTTC ACAAAGTATT TCACCGATAA TTTCTGGTGG GATCCCGCAT TATCCAAGAC 
180 

CTTTGAGGGA AAAAAACCCT GCTACTTCAA CACAGGCGTA ATGGTGATCG ATCTTGAAAA 
240 
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ATGGCGGGCA GGGGAATTCA 
300 

CCGTATCTAT GAGCTCGGAT 
360 

GCAAGTCGAT CATCGTTGGA 
420 

CCGAGATCTT CACCCTGGAC 
480 

GCTACGCCTG GAATGCCAAG 
540 

TTTATCGATC AACGTATTAC 
600 

ATCGAATTAA ACCTGATTTG 
660 

GTTTTGAATT TCAATTCTGG 
720 

CAAATCCATC ATGAGGGACC
780 

CGCCTGTGAA GAATGATATT 
840 

CAGCCAGCAG AGAGGCAAGC 
900 

AATTTTCGGC GACTGTACAG
960 

CCTGAACCAA CAACTGTATA 
1020 

AAAAAAAAAA AAAAAAAA 
1038 

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 372 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

CTAGGGGTCT TGGGGGGTTC CTGATGCCCA ATTGTTGCTG TGCTTGGCAT GAACCCAAAA
60

CATGCAAGAG ATCTGTAGTC AGTAGTCTTG TTGGATCTAT AGCTTTTAGA AAAGAGTCAC
120 

GTCCTTTTAG GGTAACATCA TTCCAACCAT ATCCAGTTCC ACCACCGGCT ACACCTTCAA 
180 

CGGGAGGAGG AGCAAGATAT TCAGCATTGC TTTGGGCACC AGATGGATAG GCATTATTTT 
240 

CCATCGGAAT TCAGCCGAGC TCGCCCCCTC AGTCCAATCG TCGTGAAAAT CCCTCAAAAT 
300 

TGGGCAATTC TGGCTCGAAA TCGCCAAATT ATGGGCTACA ACAGGATTAA AATTGCACAG 
360 

AAATCTGCCA GT 
372 

(2) INFORMATION FOR SEQ ID NO:74: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 545 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
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CAAGAAAGAT CGAAATCTGG 
CATTACCGCC ATTTTTACTG 
ATCAGCACGG TTTAGGCGGA 
CTGTCAGTTT GTTGCAT7GG 
CGGACTTGCC CTCTGGATAC 
CTAAATGGGT GAGAGAGCCT 
ATAAAATGCC AAATAGAACT 
TAACGAATAG AAGAAAACAA 
AATCGTTTGA ATTTAGTATT 
GTGGACTGAT CTATTTATAT 
AATGCCGCTG CAAGTCATGT 
GATGTAAATT TTTGGAACAT 
ATACCTTATA AATGTATCTG 




ATGGACATAC 


AGAAGGAACG 


GTATTTGCTG 


3TTTGGTTAA 


GATAATTTGC 


AAGGCCTTTG 


AGTGGTAAGG 


GCAAACCTTG 


TTTATGGGCT 


CCTTATGATC 


CTCTCCTCGG 


GGTGCTTTTT 


TTACGCCTAT 


GCATCTTTCA 


TAGCACAGCC 


ACAGGCAGGA 


AAT AAG G T T G 


"7CCATATAA 


TTGTACTGCC 


ATGCCATCCT 


AGGGAAGGCG 


7TGTGAACTC 


TAATATCATT 


AT GAT AAG 7 7 


CAACTCCATT 


77TGCATAAA 
AAAGAATTCG GCACGAGGGC AATCCGAGCC TAGCCAACCA ACTTGGCAGC AAGGAGCACA
60 

GGGAGTTGGC GAGAGAAGCT GTTAGGAAAT CTTTGGTATT GTTGAAAAAT GGGAAGTCAG 
120 

CCAACAAGCC TTTGCTCCCT TTGGAGAAGA ATGCTTCCAA GGTTCTTGTT GCAGGAACCC 
180 

ATCCTGATAA TCTGGGTTAT CAGTGTGGTG GATGGACGAT GGAATGGCAA GGATTAAGTG 
240 

GAAACATAAC CGTAGGAACT ACAATTCTGG AAGCTATCAA ACTAGCTGTC AGCCCCTCTA 
300 

CTGAAGTGGT TTATGAGCAA AATCCAGATG CTAACTATGT CAAAGGACAA GGGTTTTCAT
360 

ATGCCATTGT GGTTGTGGGT GAGGCACCAT ACGCAGAAAC GTTTGGAGAC .-ATCTTAATT 
420 

TGACCATTCC CCTAGGCGGA GGGGACACGA TTAAGACGGT CTGTGGCTCC TTGAAATGCC
480 

TTGTAATCTT GATATCTGGA AGGCCACTTG TTATTGAACC TTATCTTCCA TTSGTGGATC 
540 

GTTTT 
545 

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 463 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

GCAGGTCGAC ACTAGTGGAT CCAAAGAATT CGGCACGAGA AAAAACAAAT GTTAGCTAGC
60 

CTAGTGATGA GCTTTACGTA TACCTGGCCT TTTATACATG GATCTGAGTT TTTATGCAGG 
120 

TGTAGAGCCT TTTGTTACTC TGTATCACTG GGACTTGCCA CAAGCTCTGG AGGACGAATA 
180 

CGGTGGATTT CGTAGCAAAA AAGTTGTGGA TGACTTTGGC ATATTCTCAG AAGAATGCTi 
240 

TCGTGCTTTT GGAGACCGTG TGAAGTACTG GGTAACTGTT AACGAACCGT TGATCTTCTC 
300 

ATATTTTTCT TACGATGTGG GGCTTCACGC ACCGGGCCGC TGTTCGCCTG GATTTGGAAA 
360 

CTGCACTGCG GGAAATTCAG CGACAGAGCC TTATATTGTA GCCCATAACA TGCTTCTTGC 
420 

ACATAGTACC GCTGTTAAAA ATATATAGCA TAAATACCCA GGG 
463

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 435 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

ACACTAGTGG ATCCAAAGAA TTCGGCACGA GGCTACCATC TTCCCTCATA ATATTGGGCT 
60 

TGGAGCTACC AGGGATCCTG ATCTGGCTAG AAGAATAGGG GCTGCTACGG CTTTGGAAGT 
120 

TCGAGCTACT GGCATTCAAT ACACATTTGC TCCATGTGTT GCTGTTTGCA GAGATCCTCG 
180 
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ATGGGGCCGC TGCTATGAGA GCTACAGTGA GGATCCAAAA AT7GTCAAGG "ATGAC73A 
240 

GATTATCGTT GGCCTGCAAG GGAATCCTCC TGCTAATTCT ACAAAAGGGG GGCCTTTTAT
300

AGCTGGACAG TCAAATGTTG CAGCTTGTGC TAAGCATTTT GTGGGTTATG 3TGGAACAAC
360

CAAAGGTATC GATGAGAATA ATACTGTTAT CAACTATCAA GGGTTATTTC AACATTCCAA
420

ATTACCCCCA ATTTT
435 

(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 451 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

GAATTCGGCA CGAGCCTAGA ATTCTATGGT GAAAATTGTT GGGACAAGGC TGCCCAAGTT
60

TACAAAGGAA CAGTCCCAAA TGGTTAAAGG TTCAATAGAC TATCTAGGCG TTAACCAATA
120

CACTGCTTAT TACATGTATG ATCCTAAACA ACCTAAACAA AATGTAACAG ATTACCAGAC
180

TGGACTGGAA TACAGGCTTT GCATATGCTC GCAATGGAGT GCCTATTGGA CCAAGGGCGA
240

ACTCCAATTG GCTTTACATT GTGCCTTGGG GTCTATACAA GGCCGTCACA TACGTAAAAG
300

AACACTATGG AAATCCAACT ATGATTCTCT CTGAAAATGG AATGGACGAC CTGGAAACGT
360

GACACTTCCA GCAGGACTGC ATGATACCAT CAGGGGTAAC TACTATAAAA GCTATTTGCA
420

AAATTTGATT AATGCACGTG AATGACCGGG G
451 

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 374 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

CTGCTCTGCA AGCAGTACTA TGCACAGCAA GGCCTGCTTA ACTGAAAACA GAGCGCTGAG 
60 

CTTGAGGAAA CGCTCAAGCA TTGCTGAGGC CACCGTTTAT CTAAATAGCG CAACATAGGG
120

CTTCAGAAAA ATGGCAATGG CACAAGCATT CAGAGGCCGT GTCTTGCAAG CTGCCCGTTT
180

GCTCCGCCGC AACATTCTGC CGGAGGATAA AAGCTTTGGA TCCGCTGCTT CTCCTAGACG
240

AGCTCTTAGC CTGCTCTCAT CAAAAGCCTT CATCTCTTTC TCTGTTGAAC 3GCATCGGCT
300

AGCTGCTACA AATTCAACAA TTGTGTTGCA ATCTCGAAAC TTTTCTGCAA AAGGTAAAAA
360

GACAGGACAA TCTG
374 

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 457 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

GAAGAATGGA AGAGATTAAT GGTGATAACG CAGTAAGGAG GAGCTGCTTT CCTCCAGGTT
60

?CATGTTTGG GATAGCAACT TCTGCTTATC AGTGTGAAGG AGCTGCCAAC GAAGGTGGAA
120

^GGCCCAAG CATCTGGGAC TCATTTTCAC GAACACCAGG CAAAATTCTT GATGGAAGCA
180

ACGGTGATGT AGCAGTGGAT CAGTATCATC GTTATAAGGC AGATGTAAAA CTGATGAAAG
240

"tATGGGCGT GGCTACCTAC AGATTCTCGA TTTCATGGCC TCGTATATTT CCAAAGGGAA
300

AAGGAGAGAT CAATGAGGAA GGAGTAGCCT ATTACAATAA CCTCATCAAT GAACTCCTCC
360

AGAATGGAA? CCAAGCGTCT GTCAACTTTG TTTCACTGGG ATACTCCCCA GTCTCTGGAG
420

GATGAATATG GCGGATTTCT GAGGCCAACC ATTGTGA

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 346 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:




GGTGTGATGG CAGGAATTCC AGTCCTAAGG CCATTTTGCA TCTGTTTGCT TTCAGTCTAC
60

ATGCTGCACA TTGTAGCTGC AGTAGCTTCA CCAAGGCTAG GTAGAAGCAG CTTCCCAAGG
120

GGTTTCAAAT TTGGTGCAGG GTCATCTGCT TATCAGGCGG AAGGAGCTGC TCATGAGGGT
180

GGCAAAGGCC CAAGCATTTG GGATACATTC TCCCACACTC CAGGTAAAAT CGCTGATGGG
240

AATATTGGGA TGTTGCAGTA GATCAATACC ACCGTTATAA GGAAGATGTG CAGCTTCTCA
300

AATACATGGG AATGGACGTC TATCGTTTCT CTATCTCCTG GTCACG
346





(2) INFORMATION FOR SEQ ID NO:81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 957 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

GAATTCGGCA CGAGAAAGCC CTAGAATTTT TTCAGCATGC TATCACAGCC CCAGCGACAA
60

CTTTAACTGC AATAACTGTG GAAGCGTACA AAAAGTTTGT CCTAGTTTCT CTCATTCAGA
120

CTGGTCAGGT TCCAGCATTT CCAAAATACA CACCTGCTGT TGTCCAAAGA AATTTGAAAT
180

CTTGCACTCA GCCCTACATT GATTTAGCAA ACAACTACAG TAGTGGGAAA ATTTCTGTAT
240

TGGAAGCTT3 TGTCAACACG AACACAGAGA AGTTCAAGAA TGATAGTAAT TTGGGGTTAG
300

TCAAGCAAGT TTTGTCATCT CTTTATAAAC GGAATATTCA GAGATTGACA CAGACATATC
360

TGACCCTCTC TCTTCAAGAC ATAGCAAGTA CGGTACAGTT GGAGACTGCT AAGCAGGCTG
420

AACTCCATGT TCTGCAGATG ATTCAAGATG GTGAGATTTT TGCAACCATA AATCAGAAAG
480

ATGGGATGGT GAGCTTCAAT GAGGATCCTG AACAGTACAA AACATGTCAG ATGACTGAAT
540

ATATAGATAC TGCAATTCGG AGAATCATGG CACTATCAAA GAAGCTCACC ACAGTAGATG
600

AGCAGATTTC GTGTGATCAT TCCTACCTGA GTAAGGTGGG GAGAGAGCGT TCAAGATTTG
660

ACATAGATGA TTTTGATACT GTTCCCCAGA AGTTCACAAA TATGTAACAA ATGATGTAAA
720

TCATCTTCAA GACTCGCTTA TATTCATTAC TTTCTATGTG AATTGATAGT CTGTTAACAA
780

TAGTACTGTG GCTGAGTCCA GAAAGGATCT CTCGGTATTA TCACTTGACA TGCCATCAAA
840

AAAATCTCAA ATTTCTCGAT GTCTAGTCTT GATTTTGATT ATGAATGCGA CTTTTAGTTG
900 

TGACATTTGA GCACCTCGAG TGAACTACAA AGTTGCATGT TAAAAAAAAA AAAAAAA 
957 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 489 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

GCAGGTCGAC ACTAGTGGAT CCAAAGAATT CGGCACGAGA TAAGACTAAT TTTCCAGACA 
60 

ATCCTCCATT CCCATTCAAT TACACTGGTA CTCCACCCAA TAATACACAG GCTGTGAATG
120

GGACTAGAGT AAAAGTCCTT CCCTTTAACA CAACTGTTCA ATTGATTCTT CAAGACACCA
180

GCATCTTCAG CACAGACAGC CACCCTGTCC ATCTCCATGG TTTCAATTTC TTTGTGGTGG
240

GCCAAGGTGT TGGAAACTAC AATGAATCAA CAGATGCACC AAATTTTAAC CTCATTGACC
300

CTGTCGAGAG AAACACTGTG GGAGTTCCCA AAGGAGGTTG GGCTGCTATA AGATTTCGTG
360

CAGACAATCC AGGGGTTTGG TTCATGCACT GTCATTTGGA GGTTCACACA TCGTGGGGAC
420

TGAAAATGGC GTGGGTAGTA AAGAACGGAA AAGGGCCCAT CGATTTTCCA CCCGGGTGGG
480 

TACCAGTAA 
489 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 471 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

GAATTCGGCA CGAGAAAACC TTTTCAGACG AATGTTCTGA TGCTCGGCCC CGGCCAGACA
60

ACAGACATAC TTCTCACTGC CAATCAGGCT ACAGGTAGAT ACTACATGGC TGCTCGAGCA
120

TATTCCAACG GGCAAGGAGT TCCCTTCGAT AACACCACTA CCACTGCCAT TTTAGAATAC
180

GAGGGAAGCT CTAAGACTTC AACTCCAGTC ATGCCTAATC TTCCATTCTA TAACGACACC
240

AACAGTGCTA CTAGCTTCGC TAATGGTCTT AGAAGCTTGG GCTCACACGA CCACCCAGTC
300

?TCGTTCCTC AGAGTGTGGA GGAGAATCTG TTCTACACCA TCGGTTTGGG GTTGATCAAA
360

^GTCCGGGGC AGTCTTGTGG AGGTCCAACG GATCAAGATT TGCAGCAAGT ATGAATACAT
420

ATCATTTGTC CCGCAACCAC TTCTTCCAAT CCTTCAAGCT CAGCATTTTG G
471

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 338 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

GTTCGGCACT GAGAGATCCA TTTCTTTCAA TGTTGAGACA GTGAGTAGTA TTAGTTTGAT
60

ATCTCTTTCA GGAATATATC GTGCTTGCAG GATCTTTAGT TTCTGCAACA ATGTCGTTGC
120

AATCAGTGCG TCTATCTTCT GCTCTCCTTG TTTTGCTACT AGCATTTGTT GCTTACTTAG
180

TTGCTGTAAC AAACGCAGAT GTCCACAATT ATACCTTCAT TATTAGAAAG AGACAGTTAC
240

CAGGCTATGC AATAAGCGTA TAATCGCCAC CGTCAATGGC AGCTACCAGG CCCAACTATT
300

CATGTACGTG ATGGAGACGT TGTTAATTAT CAAAGCTT
338

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1229 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

AGAGAAATAA TTATATTTGT AAATTTAAGT CTACGTTTAT TAAAAAACTA CAACCCTAAA
60

TGCAGGAGAA AAAACAAGCA TGCTGTCTAC TGAAGCTTAC AAATCAAATC CCTGCGATAT
120

GTCTTTTCTC GTGCCGAATT CGGCACGAGA AGATCTTGGT TCGAGTCTCT CAGCTCTCTC
180

^AAAGGAATT TTGTGGGTCA TTTGCAGGTG AAGACACCAT GGTGAAGGCT TATCCCACCG
240

TAAGCGAGGA GTACAAGGCT GCCATTGACA AATGCAAGAG GAAGCTCCGA GCTCTCATTG
300
CAGAGAAGAA CTGTGCGCCG ATCATGGTTC GAATCGCATG GCACAGCGCT GGGACTTACG
360 

ATGTCAAGAC CAAGACCGGA GGGCCCTTCG GGACGATGAG ATATGGGGCC GAGCTTGCCC 
420 

ACGGTGCTAA CAGTGGTCTG GACATCGCAG TTAGGCTCCT GGAGCCAATC AAGGAACAGT 
480 

TCCCCATAAT CACCTATGCT GACCTTTATC AGTTGGCTGG TGTGGTGGCT GTTGAAGTGA
540

CCGGGGGACC TGACATTCCG TTCCATCCTG GAAGAGAAGA CAAGCCTGAG CCTCCAGAAG
600

AAGGCCGCCT TCCTGATGCT ACAAAAGGAC CTGATCATCT GAGGGATGTT TTTGGTCACA
660

TGGGGTTGAA TGATAAGGAA ATTGTGGCCT TGTCTGGTGC CCACACCTTG GGGAGATGCC
720

ACAAGGAGAG ATCTGGTTTT GAAGGACCAT GGACCTCTAA CCCCCTTATC TTTGACAACT
780

CTTACTTCAC AGAGCTTGTG ACTGGAGAGA AGGAAGGCCT GCTTCAGTTG CCATCTGATA
840

AGGCACTGCT TGCTGATCCT AGTTTTGCAG TTTATGTTCA GAAGTATGCA CAGGACGAAG
900

ACGCTTTCTT TGCTGACTAT GCGGAAGCTC ACCTGAAGCT TTCTGAACTT GGGTTTGCTG
960

ATGCGTAGAT TCATACCTTC TGCAGAGACA ATTCCTTGCT AGATAGCTTC GTTTTGTATT
1020

TCATCTAATC TTTTCGATTA TATAGTCACA TAGAAGTTGG TGTTATGCGC CATAGTGATA
1080

CTTGAACCTA CATGTTTTTG AAAAGTATCG ATGTTCTTTA AAATGAACAT TGAATACAAC
1140

ATTTTGGAAT CTGGTTGTGT TCTATCAAGC GCATATTTTA ATCGAATGCT TCGTTCCTGT
1200 

TAAAAAAAAA AATAAAATAA AAAAAAAAA 

1229 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1410 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86: 

GAAGATGGGG CTGTGGGTGG TGCTGGCTTT GGCGCTCAGT GCGCACTATT GCAGTCTCAG
60 

GCTTACAATG TGGTAAGTTC AAGCAATGCT ACTGGGAGTT ACAGTGAGAA TGGATTGGTG 
120 

ATGAATTACT ATGGGGACTC TTGCCCTCAG GCTGAAGAGA TCATTGCTGA ACAAGTACGC 
180 

CTGTTGTACA AAAGACACAA GAACACTGCA TTCTCATGGC TTAGAAATAT TTTCCATGAC 
240 

TGTGCTGTGG AGTCATGTGA TGCATCGCTT CTGTTGGACT CAACAAGGAA CAGCATATCA 
300 

GAAAAGGACA CTGACAGGAG CTTCGGCCTC CGCAACTTTA GGTATTTGGA TACCATCAAG 
360 

GAAGCCGTGG AGAGGGAGTG CCCCGGGGTC GTTTCCTGTG CAGATATACT CGTTCTCTCT
420

GCCAGAGATG GCGTTGTATC GTTGGGAGGA CCATACATTC CCCTGAAGAC GGGAAGAAGA
480

GATGGACGGA AGAGCAGAGC AGATGTGGTG GAGAATTACC TGCCCGATCA CAATGAGAGC
540 

ATCTCCACTG TTCTGTCTCG CTTCAAAGCC ATGGGAATCG ACACCCGTGG GGTTGTTGCA 
600 

CTGCTGGGGG CTCACAGCGT GGGGAGGACT CACTGCGTGA AGCTGGTGCA CAGGCTGTAC
660 
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CCGGAAGTAG ATCCGACACT GGACCCTGGG CACGTGGAGC ACATGAAGCA T.-_AGTGCCCo 
720

GACGCGATCC CCAACCCGAA GGCAGTGCAG TATGTGCGGA ACGACCGGGG AACGCCTATG 
780 

AAGCTGGACA ACAACTACTA CGTGAACCTG ATGAACAACA AGGGGCTCCT AATAGTGGAC
840

CAGCAACTGT ATGCAGATTC GAGGACCAGG CCGTATGTGA AGAAGATGGC AAAAAGCCAG
900 

GAATACTTCT TCAAATACTT CTCCCGGGCG CTCACCATCC TCTCTGAGAA CAATCCTCTC 
960 

ACCGGCGCTC GAGGAGAAAT CCGTCGGCAG TGCTCGCTCA AAAACAAATT GCACACAAAA 
1020 

AGCAAGCGTT GAGCGATAGC TCAATGCCGC AGTGGTGGGA GTGATAGCGT G.-.TGCCr-.CAo 
1080 

TGGTGGGCAT TTCATATATA AATTGCAGTT TGCGTTTTTA TTAGATAATC ATAATGGTGT 
1140 

GGTGTGACTA TGCCCTGCGA ATCACATCGA TGAACCACAA CCGAACCGTG GAACAGTAGG
1200

CTTATTCCCT TATGTAAGCA GAACCTTTTA TTATAAGCAA AAAAGACAAT CCTGTCTGTT
1260

ATTCTAGTAT AATTTTGTCA TCAGTTAAAG TTGCTCATCT GATAATAACT GGAAACGGTA
1320 

AAATATGACA ACTACGTATC TTCTTTGGTC ATCTGATAAT AACCGGAAAC Z- AT AAAA T AT 
1380 

GACAACTACA TATATTCTTT AAAAAAAAAA 
1410 

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 687 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

GTAGTTTCGT TTTACAACAA TCTCAGGTTT TGAATCTCAG AATAGTTGCG AAAGGAAGCG
60

ATGACGAAGT ACGTGATCGT TAGCTCCATT GTGTGTTTCT TT3TATTTGT TTCTGCGTGC
120

ATAATTTCTG TCAATGGATT AGTTGTCCAT GAAGATGATC TGTCAAAGCC TGTGCATGGG
180

CTTTCGTGGA CATTTTATAA GGACAGTTGC CCCGACTTGG AGGCCATAGT GAAATCGGTA
240

CTTGAGCCGG CGTTGGACGA AGATATCACT CAGGCCGCAG GCTTGCTGAG ACTTCATTTC
300

CATGACTGTT TTGTGCAGGG TTGCGATGGG TCCGTGTTGC TGACAGGAAC TAAAAGAAAC
360

CCCAGTGAGC AACAGGCTCA GCCAAACTTA ACACTAAGAG CCCGGGCCTT GCAGCTGATC
420

GACGAAATTA AAACCGCTGT AGAAGCTAGC TGCAGTGGGG TTGTAACTTG TGCAGACATT
480

CTGGCTTTGG CTGCTCGTGA CTCCGTCCGC TCAGGAGGCC CAAAATTTCC AGTACCACTT
540

GGCCGCAGAG ATAGCCTAAA GTTTGCCAGT CAATCCGTAG TTCTC3CCAA TATACCAACT
600

CCAACTTTAA ATTTGACACA GCTGATGAAC ATTTTTGGCT CCAAAGGATT CAGTTTGGCC
660

GAAATGGTTG CTCTTCAGGT GGCACAC
687

(2) INFORMATION FOR SEQ ID NO: 88: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 688 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

GTAGTTTCGT TTTACAACAA TCTACAGGTT TTGAATCTCA GAATAGTTGC GAAAGGAAGC
60

GATGACGAAG TACGTGATCG TTAGCTCCAT TGTATGTTTC TTTGTATTTG TTTCTC-CGTG
120

CATAATTTCT GTCAATGGAT TAGTTGTCCA TGAAGATGAT CTGTCAAAGC TTGTGCATGG
180

GCTTTCGTGG ACATTTTATA AGGACAGTTG CCCCGACTTG GAGGCCATAG TGAAATC3GT
240

ACTTGAGCCG GCGTTGGACG AAGATATCAC TCAGGCCGCA GGTTGCTGAG ACTTCATTTC
300

CATGACTGTT TTGTGCAGGG TTGCGATGGG TCCGTGTTGC TGACAGGAAC TAAAAGAAAC
360

CCCCGAGTGA GCAACAGGCT CAGCCAAACT TAACACTAAG AGCCCGGGCC TTGCAGCTGA
420

TCGACGAAAT TAAAACCGCT GTAGAAGCTA GCTGCAGTGG GGTTGTAACT TGTGCAGACA
480

TTCTGGCTTT GGCTGCTCGT GACTCCGTCG CTCAGGAGGC CCAAAATTTC CAGTACCACT
540

TGGCCGCAGA GATAGCCTAA AGTTTGCCAG TCAATCCGTA GTTCTCGCCA ATATACCAAC
600

TCCAACTTTA AATTTGACAC AGCTGATGAA CATTTTTGGC TCCAAAGGAT TTAGTTTGGC
660

CGAAATGGTT GCTCTTCAGG TGGCACAC
688
Claims:

1. An isolated DNA sequence comprising a nucleotide sequence selected from the
group consisting of

(a) sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88;

(b) complements of the sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88;

(c) reverse complements of the sequences recited in SEQ ID NO: 3, 13, 16-70,
72-88;

(d) reverse sequences of the sequences recited in SEQ ID NO: 3, 13, 16-70,
72-88; and

(e) sequences having at least about a 99% probability of being the same as a 
sequence of (a) - (d) as measured by computer algorithm FASTA. 

2. A DNA construct comprising a DNA sequence according to claim 1.

3. A transgenic cell comprising a DNA construct according to claim 2. 

4. A DNA construct comprising, in the 5'-3' direction:

(a) a gene promoter sequence, 

(b) an open reading frame coding for at least a functional portion of an 
enzyme encoded by a nucleotide sequence selected from the group 
consisting of sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88 and
sequences having at least about a 99% probability of being the same as a 
sequence of SEQ ID NO: 3, 13, 16-70, 72-88 as measured by computer 
algorithm FASTA; and 

(c) a gene termination sequence. 

5. The DNA construct of claim 4 wherein the open reading frame is in a sense
orientation.

6. The DNA construct of claim 4 wherein the open reading frame is in an antisense
orientation.

7. The DNA construct of claim 4, wherein the gene promoter sequence and gene
termination sequences are functional in a plant host.

8. The DNA construct of claim 4, wherein the gene promoter sequence provides 
for transcription in xylem. 

9. The DNA construct of claim 4 further comprising a marker for identification of 
transformed cells. 

10. A DNA construct comprising, in the 5'-3' direction:

(a) a gene promoter sequence, 

(b) a non-coding region of a gene coding for an enzyme encoded by a 
nucleotide sequence selected from the group consisting of sequences 
recited in SEQ ID NO: 3, 13, 16-70, 72-88 and sequences having at least
about a 99% probability of being the same as a sequence of SEQ ID NO:
3, 13, 16-70, 72-88 as measured by computer algorithm FASTA; and

(c) a gene termination sequence. 

11. The DNA construct of claim 10 wherein the non-coding region is in a sense 
orientation. 

12. The DNA construct of claim 10 wherein the non-coding region is in an 
antisense orientation. 

13. The DNA construct of claim 10, wherein the gene promoter sequence and gene
termination sequences are functional in a plant host. 

14. The DNA construct of claim 10, wherein the gene promoter sequence provides
for transcription in xylem.

15. A transgenic plant cell comprising a DNA construct, the DNA construct
comprising, in the 5'-3' direction:

(a) a gene promoter sequence; 

(b) an open reading frame coding for at least a functional portion of an 
enzyme encoded by a nucleotide sequence selected from the group 
consisting of sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88 and
sequences having at least about a 99% probability of being the same as a
sequence of SEQ ID NO: 3, 13, 16-70, 72-88 as measured by computer
algorithm FASTA; and 

(c) a gene termination sequence. 

16. The transgenic plant cell of claim 15 wherein the open reading frame is in a 
sense orientation. 

17. The transgenic plant cell of claim 15 wherein the open reading frame is in an 
antisense orientation. 

18. The transgenic plant cell of claim 15 wherein the DNA construct further 
comprises a marker for identification of transformed cells. 

19. A plant comprising a transgenic plant cell according to claim 15, or fruit or
seeds thereof.

20. The plant of claim 19 wherein the plant is a woody plant.

21. The plant of claim 20 wherein the plant is selected from the group consisting of 
eucalyptus and pine species. 

22. A transgenic plant cell comprising a DNA construct, the DNA construct 
comprising, in the 5'-3' direction:

(a) a gene promoter sequence;

(b) a non-coding region of a gene coding for an enzyme encoded by a
nucleotide sequence selected from the group consisting of sequences
recited in SEQ ID NO: 3, 13, 16-70, 72-88 and sequences having at least
about a 99% probability of being the same as a sequence of SEQ ID NO:
3, 13, 16-70, 72-88 as measured by computer algorithm FASTA; and

(c) a gene termination sequence. 

23. The transgenic plant cell of claim 22 wherein the non-coding region is in a sense 
orientation. 

24. The transgenic plant cell of claim 22 wherein the non-coding region is in an 
antisense orientation.

25. A plant comprising a transgenic plant cell according to claim 22, or fruit or
seeds thereof. 

26. The plant of claim 25 wherein the plant is a woody plant. 

27. The plant of claim 26, wherein the plant is selected from the group consisting of
eucalyptus and pine species. 

28. A method for modulating the lignin content of a plant comprising stably 
incorporating into the genome of the plant a DNA construct comprising, in the 
5'-3' direction:

(a) a gene promoter sequence; 

(b) an open reading frame coding for at least a functional portion of an 
enzyme encoded by a nucleotide sequence selected from the group 
consisting of sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88 and
sequences having at least about a 99% probability of being the same as a
sequence of SEQ ID NO: 3, 13, 16-70, 72-88 as measured by computer
algorithm FASTA; and

(c) a gene termination sequence.

29. The method of claim 28 wherein the plant is selected from the group consisting
of eucalyptus and pine species.

30. The method of claim 28 wherein the open reading frame is in a sense 
orientation. 

31. The method of claim 28 wherein the open reading frame is in an antisense 
orientation. 

32. A method for modulating the lignin content of a plant comprising stably 
incorporating into the genome of the plant a DNA construct comprising, in the 
5'-3' direction:

(a) a gene promoter sequence;

(b) a non-coding region of a gene coding for an enzyme encoded by a 
nucleotide sequence selected from the group consisting of sequences 
recited in SEQ ID NO: 3, 13, 16-70, 72-88 and sequences having at least
about a 99% probability of being the same as a sequence of SEQ ID NO:
3, 13, 16-70, 72-88 as measured by computer algorithm FASTA; and

(c) a gene termination sequence. 

33. The method of claim 32 wherein the non-coding region is in a sense orientation. 

34. The method of claim 32 wherein the non-coding region is in an antisense 
orientation. 

35. The method of claim 32 wherein the plant is a woody plant. 

36. The method of claim 35, wherein the plant is selected from the group consisting 
of eucalyptus and pine species. 

37. A method for producing a plant having altered lignin structure comprising:

(a) transforming a plant cell with a DNA construct comprising, in the 5'-3'
direction, a gene promoter sequence, an open reading frame coding for at
least a functional portion of an enzyme encoded by a nucleotide
sequence selected from the group consisting of sequences recited in SEQ
ID NO: 3, 13, 16-70, 72-88 and sequences having at least about a 99%
probability of being the same as a sequence of SEQ ID NO: 3, 13, 16-70,
72-88 as measured by computer algorithm FASTA, and a gene
termination sequence to provide a transgenic cell;

(b) cultivating the transgenic cell under conditions conducive to 
regeneration and mature plant growth. 

38. The method of claim 37 wherein the open reading frame is in a sense 
orientation. 

39. The method of claim 37 wherein the open reading frame is in an antisense 
orientation. 

40. The method of claim 37 wherein the plant is a woody plant. 

41. The method of claim 40 wherein the plant is selected from the group consisting 
of eucalyptus and pine species. 

42. A method for producing a plant having altered lignin structure comprising: 

(a) transforming a plant cell with a DNA construct comprising, in the 5'-3'
direction, a gene promoter sequence, a non-coding region of a gene
coding for an enzyme encoded by a nucleotide sequence selected from
the group consisting of sequences recited in SEQ ID NO: 3, 13, 16-70,
72-88 and sequences having at least about a 99% probability of being
the same as a sequence of SEQ ID NO: 3, 13, 16-70, 72-88 as measured
by computer algorithm FASTA, and a gene termination sequence to
provide a transgenic cell;

(b) cultivating the transgenic cell under conditions conducive to
regeneration and mature plant growth.

43. The method of claim 42 wherein the non-coding region is in a sense orientation. 

44. The method of claim 42 wherein the non-coding region is in an antisense 
orientation. 

45. The method of claim 42 wherein the plant is a woody plant. 

46. The method of claim 45 wherein the plant is selected from the group consisting 
of eucalyptus and pine species. 

47. A method of modifying the activity of an enzyme in a plant comprising stably 
incorporating into the genome of the plant a DNA construct including 

(a) a gene promoter sequence; 

(b) an open reading frame coding for at least a functional portion of an 
enzyme encoded by a nucleotide sequence selected from the group 
consisting of sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88 and
sequences having at least about a 99% probability of being the same as a
sequence of SEQ ID NO: 3, 13, 16-70, 72-88 as measured by computer
algorithm FASTA; and 

(c) a gene termination sequence. 



48. The method of claim 47 wherein the open reading frame is in a sense 
orientation. 

49. The method of claim 47 wherein the open reading frame is in an antisense 
orientation. 

50. A method of modifying the activity of an enzyme in a plant comprising stably
incorporating into the genome of the plant a DNA construct including

(a) a gene promoter sequence;

(b) a non-coding region of a gene coding for an enzyme encoded by a 
nucleotide sequence selected from the group consisting of sequences 
recited in SEQ ID NO: 3, 13, 16-70, 72-88 and sequences having at least 
about a 99% probability of being the same as a sequence of SEQ ID NO: 
3, 13, 16-70, 72-88 as measured by computer algorithm FASTA; and

(c) a gene termination sequence. 

51. The method of claim 50 wherein the non-coding region is in a sense orientation.

52. The method of claim 50 wherein the non-coding region is in an antisense 
orientation. 

53. The method of claim 50 wherein the plant is a woody plant. 

54. The method of claim 53 wherein the plant is selected from the group consisting 
of eucalyptus and pine species. 
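
Claim 1 distinguishes a recited sequence from its complement, its reverse complement and its reverse sequence. The relationship between these four forms can be illustrated with a minimal Python sketch. The function names are illustrative assumptions and do not appear in the application; the example string is the first ten bases of SEQ ID NO: 72 as printed in the sequence listing.

```python
# Illustrative helpers only; these names are not taken from the application.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Replace each base with its Watson-Crick partner, keeping the order."""
    return seq.translate(_COMPLEMENT)


def reverse(seq: str) -> str:
    """Read the same bases in the opposite order."""
    return seq[::-1]


def reverse_complement(seq: str) -> str:
    """Complement the sequence, then read it in the opposite order."""
    return reverse(complement(seq))


if __name__ == "__main__":
    example = "GAATTCGGCA"  # first ten bases of SEQ ID NO: 72 as printed
    print("sequence:          ", example)
    print("complement:        ", complement(example))
    print("reverse complement:", reverse_complement(example))
    print("reverse sequence:  ", reverse(example))
```

For the example above, the complement is CTTAAGCCGT, the reverse complement is TGCCGAATTC and the reverse sequence is ACGGCTTAAG.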
This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. Claims Nos.:
because they relate to subject matter not required to be searched by this Authority, namely:

2. because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:

3. because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

see additional sheet
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searchable claims. 
3. As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:

4. No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

This International Searching Authority found multiple (groups of)
inventions in this international application, as follows: 

1. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 3,17,48,49 encoding 
cinnamate 4-hydroxylase (C4H), plant expression constructs 
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 

2. Claims: 1-54 partially 

Isolated DNA sequences of ID nos 18,50-52 encoding coumarate 
3-hydroxylase (C3H), plant expression constructs 
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



3. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 35,36,81 encoding 
phenolase (PNL), plant expression constructs incorporating 
said sequences, methods to modulate lignin content, 
structure, and enzyme activity in plants, transgenic plants 
and plant cells containing said constructs. 



4. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 22-25,53-55 encoding 
O-methyl transferase (OMT), plant expression constructs 
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



5. Claims: 1-54 all partially 

Isolated DNA sequence of ID no 30, encoding cinnamyl alcohol 
dehydrogenase (CAD), plant expression constructs 
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



6. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 26-29,58-70 encoding 
cinnamoyl-CoA reductase (CCR), plant expression constructs 
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



7. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 16,45-47 encoding 
phenylalanine ammonia lyase (PAL), plant expression 
constructs incorporating said sequences, methods to modulate 
lignin content, structure, and enzyme activity in plants 
using said constructs, and transgenic plants and plant cells 
containing them. 



8. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 55,57 encoding 
4-coumarate:CoA ligase (4CL), plant expression constructs 
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



9. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 31-33,72 encoding coniferol 
glucosyl transferase (CGT), plant expression constructs 
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



10. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 34,73-80 encoding coniferin 
beta-glucosidase (CBG), plant expression constructs 
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



11. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 37-41,82-84 encoding 
laccase (LAC), plant expression constructs incorporating 
said sequences, methods to modulate lignin content, 
structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 
12. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 13,42-44,85-88 encoding 
peroxidase (POX), plant expression constructs incorporating 
said sequences, methods to modulate lignin content, 
structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 

13. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 19-21 encoding 
ferulate 5-hydroxylase (F5H), plant expression constructs 
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 